We dedicate this book to two individuals, friends and colleagues, 
who, although they worked in very different disciplines, contributed 
greatly to our ability to model species occurrences. 


Dr. John Estes, geographer, visionary mapper of the planet's surface, 
and builder of bridges among disciplines, 

and 
Dr. Jared Verner, wildlife biologist, who pioneered the use of wildlife 
habitat models in land use planning. 
Foreword 


Peter H. Raven 


One of the central problems in ecology is how pattern 
and scale influence the distribution and abundance of 
organisms that we see. We seek to understand and pre- 
dict the ecological causes and consequences of global 
climate change and of human-caused changes in the 
environment. There exists a great deal of literature, 
both basic and applied, that attempts to explain the 
patterns that we observe in nature and yet we continue 
to struggle with this important problem. Part of the 
difficulty lies in the fact that there is no single funda- 
mental scale at which ecological phenomena should be 
studied (Levin 1992). Ecosystems and their associated 
populations vary over a range of spatial, temporal, 
and organizational scales, and the mechanisms driving 
the patterns we see may be operating at very different 
levels. Every ecosystem exhibits variability and patchi- 
ness on a variety of levels, and each interacts with 
other systems to promote or inhibit the phenomena of 
persistence or coexistence among species (Chesson 
1986; Levin 1992). To address questions regarding 
population phenomena, we need to find ways to quan- 
tify these patterns in time and space, to understand 
how pattern may change with scale, and to understand 
the causes and consequences of the patterns we see 
(Wiens 1989b). 

The use of descriptive statistics is a starting point 
for understanding pattern, but correlations are not a 
substitute for mechanistic understanding (Lehman 
1986). We now recognize there is no single correct 
scale or level at which to describe a system, and also 
that not all scales serve equally well for describing pat- 
terns (Levin 1992). In the absence of a complete inven- 
tory of the distribution and abundance of the world's 
species (Raven and Wilson 1992), a problem that will 
not soon be resolved, models are one way to gain and 
apply information about the patterns of the distribu- 
tion of biodiversity in space and time. But we must 
have at hand a variety of models with different levels 
of complexity that will allow us to determine the ap- 
propriate levels of aggregation and simplification for 
each question (Levin 1992). As practitioners of the sci- 
ence of conservation biology, we recognize that timely 
policy and management decisions must be made in the 
face of uncertainty. To accomplish this, we commonly 
now rely on remote sensing and spatial statistics to 
quantify patterns at broad scales, but we still need a 
great deal of experimental evidence to provide infor- 
mation about the mechanisms driving the patterns. Fi- 
nally, we need to provide measures of the accuracy of 
our predictions so that we can make these decisions 
with some level of confidence. 

The thrust of the conference, Predicting Species Oc- 
currences: Issues of Scale and Accuracy, from which 
this book was developed, was to discuss the status of 
our ability to model species distributions with regard 
to concerns about pattern and scale. This volume is a 
compendium of research that defines the current state 
of our knowledge of modeling species occurrences; it 
reiterates the theoretical and basic ecological needs re- 
maining as we try to understand and predict species 
occurrences and distribution. This book is a continua- 
tion of the major effort conducted in 1984 that culmi- 
nated in Wildlife 2000: Modeling Habitat Relation- 
ships of Terrestrial Vertebrates (Verner et al. 1986b)— 
an assessment of the state of our knowledge fifteen 
years ago. This current volume evaluates the progress 
we have made since the publication of Wildlife 2000. 
In it, the authors reiterate the need for continued sup- 
port of basic and applied ecological studies of pattern 
and scale. Beyond that, the authors emphasize the 
need for measures of model accuracy when policy and 
management decisions are involved. 

I hope that this book will help to build stronger 
bridges between those scientists who conduct basic 
ecological studies in the field and their colleagues who 
use field data to create increasingly accurate and use- 
ful model predictions. 
Preface 


Michael E. Goodchild 


In late 1989, I met Brad Parks at an EPA (Environ- 
mental Protection Agency)-sponsored workshop near 
Detroit, Michigan, and we persuaded each other that 
much could be gained by encouraging greater interac- 
tion between the geographic information system (GIS) 
and environmental modeling communities. GIS tech- 
nologies were fast gaining recognition among scientists 
and policy makers as tools for the analysis of problems 
in a spatial context. Environmental models were also 
proliferating and winning acceptance as approaches 
for numerical analyses and prediction. If they could be 
integrated, these applications offered great promise in 
bridging some of the gaps between scientific research 
and policy development. Out of that conversation 
grew a series of International Conferences/Workshops 
on Integrating GIS and Environmental Modeling, in 
Boulder (1991), Breckenridge (1993), Santa Fe (1996), 
and Banff (2000). (Proceedings are available, respectively,
as Goodchild et al. (1993), Goodchild et al. (1996),
http://www.ncgia.ucsb.edu/conf/SANTA_FE_CD-ROM/main.html,
and http://www.colorado.edu/research/cires/banff/.)
It was extremely satisfying to see
our vision broadened and enriched at the Predicting 
Species Occurrences: Issues of Scale and Accuracy con- 
ference in Snowbird, Utah, in October 1999. 

Modern science and problem solving require exten- 
sive collaboration between specialists, and the high 
costs of data acquisition and model development 
argue strongly for effective mechanisms for sharing in- 
formation. Effective communication is particularly dif- 
ficult across the boundaries of disciplines that may use 
the same terms in different ways. One important com- 
ponent of community building is the construction of a 
common language (see particularly the arguments pre- 
sented by Michael Morrison and Linnea Hall in Chap- 
ter 2). Internet technologies have helped enormously 
to foster a culture of collaboration and sharing among 
scientists. At the same time, these developments have 
pointed to some critical weaknesses in our ability to 
deal effectively with pervasive problems among GIS 
and environmental modeling processes, including tem- 
poral variability, scale, representation, accuracy, visu- 
alization, and spatial context (for a comprehensive dis- 
cussion, see the research themes of the University 
Consortium for Geographic Information Science, 
UCGIS [1996] and http://www.ucgis.org). These 
themes form the core of this new book. 

One weakness in our ability to predict species oc- 
currences is rooted in the development of accurate 
models and is a major topic of this volume. The prob- 
lem with accuracy stems from the fact that the real ge- 
ographic world is infinitely complex such that it 
would require an infinite and therefore impossible 
amount of information to characterize it completely. 
Any description of any aspect of the world must there- 
fore be an approximation, generalization, or abstrac- 
tion that almost certainly omits much of the detailed 
information that organisms sense and respond to. This 
missing information constitutes uncertainty, in the 
sense that a user of a database has a degree of uncer- 
tainty about the true conditions existing in the real 
world (for a review of uncertainty in spatial ecology 
see Hunsaker et al. 2001). It also follows that an infi- 
nite number of potential approximations exist or, in 
other words, that an infinite number of ways can be 
found to represent any aspect of the world in digital 
form. Some of these will do much more damage to our 
ability to model and predict than others, because their 
associated uncertainty translates or propagates into 
uncertainty over model predictions. Finding the met- 
rics that allow us to evaluate a representation's per- 
formance in environmental modeling and prediction is 
a vitally important task. The metric long favored by 
cartographers and producers of geographic data- 
bases—the scale or representative fraction of a paper 
map—is insufficient. There are still an infinite number 
of possible digital representations at any given scale 
depending on how the objects represented in the data- 
base are specified. Advances in computing power over 
the past few decades have made it possible to handle 
much larger volumes of data. We have the ability to 
model more accurately, as Dean Stauffer argues in 
Chapter 3, and as many other chapters in this book 
demonstrate. However, given that variability in data 
stemming from unknown sources ultimately limits our 
ability to reduce the uncertainty in our predictions, au- 
thors in this volume demonstrate the importance of 
providing a measure of accuracy for all our future 
modeling and mapping efforts. 
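
The way uncertainty in a database propagates into uncertainty over model predictions can be illustrated with a minimal Monte Carlo sketch. Everything in it is an assumption made for illustration only: the suitability function, the 60 percent optimum, the normally distributed mapping error, and all parameter names are hypothetical and are not drawn from any chapter in this volume.

```python
import random
import statistics


def habitat_suitability(canopy_cover: float) -> float:
    """Hypothetical model: suitability peaks at 60 percent canopy cover."""
    return max(0.0, 1.0 - abs(canopy_cover - 0.6) / 0.6)


def propagate(mapped_cover: float, cover_sd: float,
              n_draws: int = 5000, seed: int = 1):
    """Push uncertainty in a mapped value through the model.

    Returns the mean prediction and an approximate 95 percent interval
    taken from the sorted simulated predictions.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Assumed error model: normal error around the mapped value,
        # clipped to the valid range of a proportion.
        cover = min(1.0, max(0.0, rng.gauss(mapped_cover, cover_sd)))
        draws.append(habitat_suitability(cover))
    draws.sort()
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws) - 1]
    return statistics.mean(draws), lower, upper


mean, lower, upper = propagate(mapped_cover=0.45, cover_sd=0.10)
print(f"prediction {mean:.2f}, approximate 95% limits {lower:.2f} to {upper:.2f}")
```

Raising `cover_sd` widens the limits on the prediction, which is the sense in which a poorer representation does more damage to our ability to predict.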

Geographic information systems have inherited rep- 
resentations of the world developed by cartographers. 
Today, we have computers capable of capturing not 
only how the world looks, in the form of largely static 
geographic data, but also how it works, in the form of 
codes that implement environmental process models. 
We must recognize that the data populating our GIS 
databases today were largely created to appear on 
maps, and many of them were obtained by digitizing 
or scanning those same maps. They use the scales and 
accuracies that were devised long ago to serve the 
needs of mapping. Vegetation-cover mapping practice, 
which emphasizes the delineation of homogeneous
areas of cover class separated by sharp boundaries, 
made sense to users of maps, but it may be totally in- 
appropriate for today’s GIS uses, including dynamic 
modeling of ecological processes. The concept of a 
patch as an ecologically meaningful unit may appear 
superficially similar to the concept of an area of uni- 
form cover class, but at a more fundamental level they 
may have little relationship. Similarly, the pixels of a 
remotely sensed image may appear to be convenient 
units for modeling ecological process, but they origi- 
nated in the geometry of an instrument on a satellite 
and were in no way informed by ecological reality. So 
bringing GIS and environmental modeling together of- 
fered promise at several levels: improved support for 
environmental modeling through better tools for shar- 
ing and managing data and for visualizing and dissem- 
inating results, and also improved representations in 
GIS that were more appropriate for a new generation 
of requirements. 

Carol Johnston (1993) presented an appealing con- 
ceptual structure for spatially explicit modeling of eco- 
logical populations that was firmly grounded in the- 
ory. She postulated that K strategists base their 
survival and success on local resources and are rela- 
tively easy to model in GIS, provided local resources 
are represented in suitable ways. Because organisms 
respond to resources over a local neighborhood, in ef- 
fect integrating through a spatial convolution func- 
tion, it is essential that the representation have an ap- 
propriate spatial resolution that is no coarser than the 
convolution, if the convolution is precomputed, and 
much finer if the convolution must be computed on- 
the-fly by the model. Vector representations, which 
lack explicit spatial resolution, are likely to be unsuit- 
able in these models. 
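
The idea of precomputing the convolution can be sketched as a moving-window mean over a raster of local resources. This is only an illustration under stated assumptions: the square window, the equal weighting of cells, the edge clipping, and the small resource grid are hypothetical choices, not a method taken from Johnston (1993).

```python
from typing import List

Grid = List[List[float]]


def neighborhood_mean(grid: Grid, radius: int) -> Grid:
    """Mean of each cell's square neighborhood, clipped at the raster edge."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            # Visit every cell within `radius` cells of (r, c).
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    total += grid[rr][cc]
                    count += 1
            out[r][c] = total / count
    return out


# Hypothetical 4 x 4 resource raster (e.g., proportion of forage cover).
resource = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

# Precomputed once, then read by the population model cell by cell.
smoothed = neighborhood_mean(resource, radius=1)
for row in smoothed:
    print([round(value, 2) for value in row])
```

Because the smoothed surface is computed once and stored, it needs a cell size no coarser than the window itself. A model that recomputes the window at every step must instead read the finer raw raster.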

On the other hand, r strategists present much more 
difficult problems. Here, success is determined by the 
intrinsic rate of natural increase of the population and
hence depends on spatial interaction between organisms. However,
spatial interaction presented enormous problems
to cartographers, and very little progress has been 
made in the development of effective representation of 
interactions in GIS databases. Multiagent models at- 
tempt to model interaction at the individual level and 
clearly do not scale effectively to large populations. Is- 
land models deal with interaction by assuming it to be 
perfect within the island and absent otherwise, and it 
is tempting to think that landscapes can be similarly 
fragmented into isolated patches. But simple rules for 
the identification of patches (e.g., every watershed is a 
patch, every area of homogeneous land cover class is a 
patch, every Landsat pixel is a patch) are unlikely to 
have much grounding in ecological reality. Jason Dun- 
ham, Bruce Rieman, and James Peterson explore this 
theme in relation to modeling fish occurrences in 
Chapter 26; in Chapter 59, David Theobald and 
Thompson Hobbs present an alternative approach 
that explicitly addresses gradients; and in Chapter 63 
Thomas Sisk, Barry Noon, and Haydee Hampton cre- 
ate a patch model that explicitly recognizes within- 
patch heterogeneity. 

The growth of the World Wide Web and associated 
data archives, digital libraries, and clearinghouses 
has created an environment conducive to data sharing
and access; environmental modelers are now
blessed with an abundant supply of geographic data. 
However, this series of technological developments has 
in turn raised awareness of the weaknesses of much of 
the data supply. Too often, the rules used to create 
data are not known, or are too subjective, or data 
have not been effectively validated, or quality is too 
uncertain and variable. Too often, the high cost of ac- 
curate data forces us to accept low-resolution data 
without adequate ways of assessing what has been lost 
(see, for example, the analysis presented by Kathryn 
Thomas and her coauthors in Chapter 10). Using data 
that are too coarse can affect not only the goodness of 
fit of the model, but also its structure, as Catherine 
Johnson and her coauthors show in Chapter 12. In the 
best of all possible worlds, questions of structure 
would be resolved by resorting to theory, but in Chap- 
ter 15 Claudine Tobalske shows how alternative theo- 
ries can often lead in conflicting directions. 

New acquisition systems are solving some of these 
problems—for example, the data becoming available 
from the Shuttle Radar Topography Mission of 2000 
clearly have some advantages over previous digital ele- 
vation data, and new satellite sensors offer improved 
spatial, temporal, and spectral resolution. There has 
also been substantial improvement in software support 
for modeling. GISs aimed specifically at dynamic mod- 
eling have appeared, such as the University of 
Utrecht’s PCRaster (http://www.geog.uu.nl/pcraster/), 
and powerful scripting languages have been defined 
(Van Deursen 1995). Many of the chapters in this vol- 
ume illustrate the benefits of these changes, and the 
software environments described in the chapters of 
Part 4, “Predicting Species Presence and Abundance,” 
are very different from those available ten years ago. 

The world has invested vast resources over the cen- 
turies in creating representations of geographic knowl- 
edge that are in many cases highly inappropriate for 
ecological modeling. We need a new generation of rep- 
resentations that are specifically designed to make eco- 
logical sense and to do minimal damage to the accu- 
racy of ecological models that use them for prediction. 
Specifically, representations could


* Maintain explicit spatial resolution (rasters are 
preferable in this respect to vectors) 

* Support multiple representations (e.g., preserve both
raw data and more generalized interpretations
based on them) to implement reasoning and modeling
across spatial scales, a theme explored by Brian
Maurer in Chapter 9 and other chapters in Part 2,
“Temporal and Spatial Scales”

* Interrupt continuity at ecologically significant barri- 
ers rather than in areas of steep gradient (e.g., 
bounding areas of approximately homogeneous 
cover) 

* Support the representation of interaction (e.g.,
through vector fields, metamaps [Takeyama and 
Couclelis 1997], or attributed relationships) 

* Support the measurement and modeling of uncer- 
tainty, and its propagation through models to confi- 
dence limits on outputs 

* Support the observation and modeling of change 
through representations that are fully spatiotempo- 
ral (e.g., define objects as regions in three-dimen- 
sional space-time, or as space-time trajectories) 


All of these examples are part of a much longer- 
term transition as GIS evolves away from its carto- 
graphic roots and moves to center stage in support of 
environmental modeling. 

They are some of the important changes that will be 
needed in the future as tools evolve to support the 
needs of better environmental modeling and species 
prediction. The chapters in this volume demonstrate 
the state of our knowledge regarding environmental 
modeling and the progress being made in linking such 
models with GIS technologies. They point the way for 
future investigations and identify problems that will 
need resolution before more progress can be made. To- 
gether they demonstrate our progress toward further 
developing and integrating sound reasoning and pre- 
diction across multiple spatial scales, developing repre- 
sentations that make ecological sense, measuring and 
reporting model accuracy, incorporating measures of 
uncertainty, and supporting the observation and mod- 
eling of change over time and at multiple spatial 
scales. We have come a long way in the past decade, 
but there is much opportunity for further progress. 
Introduction 


J. Michael Scott, Patricia J. Heglund, 
Michael L. Morrison, Jonathan B. Haufler, 
Martin G. Raphael, William A. Wall, and 
Fred B. Samson 


The complexity and inherent variation in species and 
their responses to physical and biological factors at 
multiple scales as well as the dynamic nature of envi- 
ronments and species ranges (Botkin 1990) make pre- 
dicting species occurrence, abundance, or viability 
with high levels of precision and accuracy difficult. 
Predictive models are, by necessity, low-dimensional
abstractions of n-dimensional forces acting on individ- 
uals. Our attempts to predict species occurrences have 
been hampered by mismatches between the spatial 
and temporal scales at which we make measurements 
and the scale at which ecological phenomena influence 
patterns of species occurrences, and by an incomplete
knowledge of a species’ life requirements. In many re- 
spects we are still in the age of exploration and dis- 
covery that was enjoyed by nineteenth-century natu- 
ralists. Consider that one hundred of the breeding bird 
species of North America are each the subjects of 
fewer than five publications in the refereed literature 
(J. T. Ratti and J. M. Scott personal communication). 
Of those species whose ecology and biology are well 
documented, there is a clear bias toward game species 
(Boone and Krohn, Introduction to Part 3 this volume; 
Karl et al. 1999). This lack of information on natural 
history of species severely limits our ability to confi- 
dently predict species occurrence. 

Factors affecting species occurrences, abundance,
and population viability in time and space occur along
gradients but are most often modeled as discrete variables
(Lawton et al. 1994). Conditions in one area of
a species' range (e.g., massive die-off on wintering
grounds, weather conditions during migration) may
determine a species' occurrence, abundance, or reproductive
rates (Beard et al. 1999) at another (Wiens
and Rotenberry 1981b). Our ability to predict with
confidence a species' status in an area is further complicated
by often-nonlinear responses to habitat
(Heglund, Chapter 1; Austin, Chapter 5). The size and
ecological context of habitat patches influence not
only whether a species occurs there but possibly many 
of its demographic values as well. Additionally, condi- 
tions occurring at several spatial scales can influence 
species occurrence and associated ecological processes 
(Wiens 1989a). All of these factors as well as previous
events (e.g., pollution, drought) influence a species' 
occurrence. Areas occupied one year may be unoccu- 
pied the next with seemingly no detectable differences 
in the habitat conditions at the site. Nature is indeed 
more complex than we can envision. 

Modeling efforts are often conducted in response to 
the needs of society. The resultant science-policy inter- 
face is a world in which many of us are uncomfort- 
able. Nonetheless, if we are to meet the needs of soci- 
ety we must be able to effectively communicate the 
results of our research to local, state, and national 
managers and policy makers. These individuals are re- 
quired to make day-to-day decisions often under se- 
vere budget constraints and deadlines and with incom- 
plete and highly uncertain information. They require 
answers for questions that range from a fine-scale 
“What’s the impact of different timber harvest regimes
on red-cockaded woodpeckers (Picoides borealis) in a
40-hectare forest?” to the coarse-scale “Which areas
of a sagebrush steppe are most important to the maintenance
of regional biodiversity?” Predicting species
responses to such scenarios requires researchers to 
think at a variety of spatial and temporal scales and at 
different levels of biological organization (demes, pop- 
ulations, metapopulation, subspecies, and species). 
Models are often developed for single species for spe- 
cific areas tens of hectares in size, less often for the 
range of a species, and almost always independently of 
ecological process or context. 

All too often there is a disconnect between re- 
searchers and managers. It is not unusual for man- 
agers to attempt to apply the results of a model devel- 
oped to predict a species response across millions of 
hectares to a 10-hectare site without fully recognizing 
the limits of the model. To avoid such misuses all 
parties must become better communicators. Researchers
must conduct research at scales that are appropriate
to the question being asked and the management
challenge at hand. Products must be presented
not only in the peer-reviewed literature, where they 
are subject to the scrutiny of colleagues, but also in 
formats that are more meaningful to managers (e.g., 
workshops, one- or two-page research information 
brochures, interviews, and videotapes). Ideally, scientists
should work with managers to interpret the management
implications of the research and clearly state
potential applications along with any inherent
limitations of the models.

Demographic units above the level of individuals 
and demes rarely can be found within a single game 
management unit, nature reserve, or county, while few 
species spend their lives within a single country. Man- 
agers are increasingly responsible for more species of 
concern and for a wider variety of taxa. Whereas their 
attention was formerly focused most often on charis- 
matic megafauna or on those species hunted or fished 
for sport or commercial gain, managers must now ad- 
dress the conservation of a broad representation of 
taxonomic entities including vascular and nonvascular 
plants, fungi, invertebrates, and lesser-known verte- 
brates. The proposed regulations under the National 
Forest Management Act and the Criteria and Indica- 
tors emerging from the Montreal Process meeting held 
in 1995 (Anonymous, undated, http://www.mpci.org/whatis/criteria_e.html)
are two examples of this
broader mandate to manage and conserve biological 
diversity. The Montreal Process identified seven crite- 
ria to provide member countries with a common defi- 
nition of what characterizes sustainable management 
of temperate and boreal forests. The seven criteria are 
(1) conservation of biological diversity; (2) mainte- 
nance of productive capacity of forest ecosystems; (3) 
maintenance of forest ecosystem health; (4) conserva- 
tion and maintenance of soil and water resources; (5) 
maintenance of forest contribution to global carbon 
cycles; (6) maintenance and enhancement of long-term 
multiple socioeconomic benefits to meet the needs of 
society; and (7) legal, institutional, and economic 
framework for forest conservation and sustainable 
management. There are sixty-seven associated indica- 
tors for these seven criteria, many of which will re- 
quire modeling species occurrences. These comprehen- 
sive mandates have brought into sharp focus the de- 
bate over the proper balance between species-based 
(fine filter) and ecosystem-based (coarse filter) ap- 
proaches to conservation. And in either case, develop- 
ment of reliable models for predicting system re- 
sponses to environmental change is essential. 

Increasingly, management efforts are conducted in 
partnership across traditional political boundaries in 
the United States (Cox et al. 1994, Scott et al. 1993, 
Stein et al. 2000) and elsewhere (Brunckhorst 2000). 
Management and policy decisions require predictions 
of species occurrences across the range of species. Our 
ability to do this requires that we work at multiple 
levels of physical and temporal resolutions, incorpor- 
ating ecology, wildlife biology, and biogeography. 
There is increasing recognition that environmental 
planning is best done when the full range of a species 
occurrence is considered (Heglund, Chapter 1; 
Heglund et al. 1994). Yet most efforts are site specific. 
Additionally, we most frequently model presence/ab- 
sence and abundance despite indications that these are 
likely poor indicators of habitat quality (Van Horne 
1983). But, as evidenced by papers presented at this 
symposium, we are making progress. We now find ex- 
amples of better approaches to modeling species oc- 
currences and to developing demographic models that 
go beyond presence/absence. 

The international symposium Wildlife 2000: Mod- 
eling Habitat Relationships of Terrestrial Vertebrates 
(Verner et al. 1986b) set the stage in 1984 for periodic 
review of our progress on modeling species occur- 
rences. With the increased awareness of the shortfalls 
of current modeling efforts we invited scientists and 
managers from government, private businesses, and 
academia to a symposium to assess what has been ac- 
complished and what hurdles to advancement remain 
in our attempts to understand and predict the distribu- 
tion and abundance of species. 

Fifteen years later, 325 individuals from fourteen 
countries assembled at Snowbird, Utah, October 
18-22, 1999, to discuss advances in our efforts to 
model species distributions since Wildlife 2000: Modeling
Habitat Relationships of Terrestrial Vertebrates
(Verner et al. 1986b). These individuals came from a 
variety of disciplines (geography, ecology, botany, etc.) 
and age classes. Approximately 20 percent of the par- 
ticipants were students. 

In seeking participants for the symposium Predict- 
ing Species Occurrences: Issues of Scale and Accuracy, 
we made a special effort to ensure participation of in- 
dividuals with different approaches to modeling. It 
was our hope that doing so would stimulate discus- 
sion, sharing of ideas, and future collaborations. Our 
goal in hosting this symposium was to examine the 
theoretical basis for model development, assess the 
degree to which assumptions were met, discuss the 
current applications of modeling techniques, and 
focus on the future of modeling for natural resource 
conservation and management. We paid particular at- 
tention to the various ecological and behavioral char- 
acteristics of species related to their distribution and 
abundance, problems related to scale (extent and 
grain) of various phenomena (e.g., time, space, habi- 
tat), problems with current methods of model valida- 
tion, and how all these affect model performance and 
reliability. Our objectives were to review the theory 
and practice of predicting species distribution, abun- 
dance, and viability; to develop our understanding of 
management and policy frameworks within which 
models are applied; to identify the strengths and 
weaknesses in the application of various modeling 
techniques; and to examine the potential for linking 
models to one another for large-area management 
programs. The following chapters are representative 
of the variety of topics covered at Snowbird. You will 
recognize that there are areas of emphasis that are 
underrepresented relative to their importance in pre- 
dicting species occurrences. These inequalities should 
provide a direction for future efforts. Because of the 
large overlap among authors in references cited, we 
have merged the literature cited section of each chap- 
ter into a single Literature Cited section at the back of 
the book. In addition to saving space, it provides a 
solid introduction to the literature on modeling 
species occurrences. You will see much innovation in 
the work presented herein, and hopefully you will be 
moved to follow one or more of these pathways to an 
ever-increasing effort to conserve our natural re- 
sources for future generations. 

Past efforts to model species occurrences have been
biased toward vertebrates, especially birds (Verner et
al. 1986b) and traditional game species (Dettmers et
al., Chapter 54; Karl et al. 1999), so we sought out
those working on other taxa to build diversity into 
this conference. Despite this effort, vertebrates, espe- 
cially birds, were the most frequently modeled group, 
just as they were in Wildlife 2000, and so they are the 
most frequently discussed group in this book. Happily, 
however, you will still find numerous discussions of 
fungi, plants, butterflies, small mammals, and amphib- 
ians in the pages that follow. 

A recurring theme among these papers is the diffi- 
culty of assessing the accuracy of species occurrence 
models. Sample sizes of several hundred independent 
observations may be needed (Karl et al., Chapter 51),
a difficult task when dealing with less-common
species. It is a task made even more daunting when we 
realize that most species are uncommon (Preston 
1948, 1962a,b). Litigation over land use practices re- 
quires that we use defensible methods of data acquisi- 
tion and measures of the reliability of our predictions 
of species presence, absence, abundance, and viability. 
Thus, considerations of accuracy and scale are impor- 
tant if we are to understand the relationships between 
species and their environment. 
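
A rough calculation suggests why several hundred observations may be needed. The sketch below uses the textbook normal approximation to the binomial to ask how many independent test sites are required to estimate overall model accuracy to within a chosen margin. It is an assumption-laden illustration, not the procedure used by Karl et al.; the function name and the example values are hypothetical.

```python
import math


def sample_size_for_accuracy(expected_accuracy: float,
                             half_width: float,
                             z: float = 1.96) -> int:
    """Independent observations needed so that the approximate 95 percent
    confidence interval on accuracy is no wider than +/- half_width.

    Normal approximation: n = z^2 * p * (1 - p) / d^2, rounded up.
    """
    p = expected_accuracy
    n = (z ** 2) * p * (1.0 - p) / (half_width ** 2)
    return math.ceil(n)


# A model expected to be about 80 percent accurate, estimated to within
# 5 percentage points.
print(sample_size_for_accuracy(0.80, 0.05))

# The least favorable case, 50 percent accuracy, at the same margin.
print(sample_size_for_accuracy(0.50, 0.05))
```

Under these assumptions the two cases call for 246 and 385 independent observations, respectively. Evaluating presence and absence separately, or working with spatially autocorrelated sites, would raise the requirement further.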

It is these issues (scale and accuracy) that we asked 
the participants of the Snowbird symposium to ad- 
dress. One of the most compelling ideas resulting from 
this effort was the determination that no silver bullet 
or single model will suffice. Different species and dif- 
ferent ecological settings require different approaches. 

The chapters in this book were selected from the 
eighty-two oral and 125 poster presentations given at 
the conference. All manuscripts were sent out for re- 
view by two or more referees and fifty-nine of those 
manuscripts may be found in the pages of this book. 
The manuscripts are grouped into six parts. Within 
each part, species are placed in phylogenetic order. 
Each part is preceded by an introduction in which the 
contents of the chapters are discussed, their limita- 
tions and strengths presented, and future directions to 
improve the accuracy and precision of modeling the 
patterns and processes associated with species occur- 
rences are noted. 

In Part 1, “Conceptual Framework,” the history 
of the theory and practice of modeling is reviewed and 
a standard terminology is presented. This part is 
preceded by a detailed discussion of the ecological 
context in which predictions of species occurrences 
are made and our struggles to understand their 
influence. In Part 2, “Temporal and Spatial Scales,” one of 
the most difficult topics dealt with at the symposium is 
discussed in terms of its influence on patterns and 
processes of species distributions. A recurring theme 
throughout this part is “What are the consequences of 
asking questions at the wrong scale?” There are 
detailed discussions of state-of-the-art modeling tools 
and descriptions of methods for assessing model 
accuracy in Part 3, “Modeling Tools and Accuracy 
Assessment.” Two important contributions in this part are 
the discussion by Alan Fielding, who considers what the 
appropriate characteristics of an accuracy measure might be, 
and the disquieting finding by Jason Karl and his 
colleagues that statistically defensible assessments of 
accuracy require a very large number of independent 
observations (Karl et al., Chapter 51). Approaches as diverse 
as Generalized Linear Modeling, Generalized Additive 
Modeling, Genetic Algorithm for Rule-set Prediction, 
and Neural Network modeling, all used to predict 
presence/absence and abundance of species, are 
discussed in Part 4, “Predicting Species Presence and 
Abundance.” In Part 5, “Predicting Species: Populations 
and Productivity,” several examples of how spatially 
explicit data on demographics can provide important 
information for managers are presented. 
Finally, in the concluding part, “Future Directions,” 
John A. Wiens provides an in-depth review and 
synthesis of the symposium. Wiens provides guidance for 
future directions and cautions regarding misuse of 
models. 

You will find two underrepresented topics in this 
volume. They are vegetation classification systems and 
the use of spatial statistics in modeling. The selection 
of an appropriate classification system is of critical im- 
portance to predicting species occurrence. Most habi- 
tat models depend on a map of biologically meaningful 
areas delineated across a landscape as the basis for 
making predictions of species occurrence, abundance, 
or fitness. Failure to address the effects of specific clas- 
sification systems and the quality of their associated 
habitat attribute data can result in major errors in 
model performance. Whereas several papers in the 
symposium did address the question of the resolution 
of the mapped units used in model predictions, the 
topics of classification system effects and habitat 
attribute quality and accuracy were not adequately 
addressed. This, then, represents an area where signi- 
ficantly more work may be needed. Further, failure to 
adequately address these factors will result in the ac- 
curacy of the relationships described in many models 
being questioned, when these relationships may in fact 
be valid but are being masked by our inability to 
describe the actual location of quality habitat within 
the landscape. 

Despite the usefulness of spatial statistics for mod- 
eling species distributions, there was little discussion 
of their use at Snowbird or in the pages of this book. 
We think that their widespread use by biologists 
awaits the development of user-friendly programs in 
widely available statistical packages (e.g., SAS, SPSS). 
Just as this volume used the results of Wildlife 2000 as 
its foundation, it is our hope that it will serve as a 
foundation for the next generation of wildlife habitat 
relationship models. 

Introductory Essay: 


Critical Issues for Improving Predictions 


Michael A. Huston 


Natural resource management and conservation de- 
pend on accurate information about the distribution 
and response dynamics of natural resources, which include 
populations of plants and animals that can 
change dramatically over time and space. Unfortu- 
nately, the complex interactions that typify ecological 
processes have been an impediment to understanding 
and predicting these shifting patterns. Theories about 
the effects of a specific environmental condition on 
some ecological property have been difficult to test be- 
cause many different environmental conditions are 
correlated with any specific property. Such covariance 
also hinders the development of predictive statistical 
models because of the difficulty of distinguishing the 
“causal” predictor from other correlated factors that 
have no causal role. Nonlinear response dynamics can 
lead ecologists to opposite conclusions about the re- 
sponse of an ecological property to environmental 
conditions. Finally, even the statistical analyses used to 
quantify these patterns don't work very well. Statistical 
methods based on the standard assumptions of linear 
relationships, normally distributed “errors,” and 
uncorrelated independent variables can actually prevent 
the identification of the “functionally significant” 
relationships necessary to understand ecological 
processes and predict ecological patterns. 

The three primary impediments to developing a co- 
herent conceptual framework for ecological predic- 
tions are closely interrelated: (1) mismatches between 
the spatial and temporal dimensions of ecological 
measurement and the dimensions at which hypothe- 
sized processes operate, (2) misunderstanding of eco- 
logical processes, and (3) use of inappropriate statis- 
tics to quantify ecological patterns and processes. 
Each of these problems stems from the failure of ecol- 
ogists to develop a coherent set of theories that apply 
at clearly defined spatial and temporal dimensions. Although 
it is beyond the scope of any single chapter to 
develop or present such a theoretical framework, it 
may be useful to look at examples of how new ap- 
proaches in each of these areas can lead to more pow- 
erful insights than those derived from traditional 
ideas. 

After defining the spatial and temporal require- 
ments for ecological sampling, this introductory essay 
will discuss three examples of theories that are rele- 
vant to the distributions of organisms and the struc- 
ture of ecological communities. Each theory describes 
how certain processes are expected to produce partic- 
ular ecological patterns. Each theory can be identified 
as being relevant to patterns that appear at particular 
resolutions of measurement, and each theory has im- 
plications for the statistical analysis of ecological pat- 
terns as well as for the management and conservation 
of natural resources, including endangered species and 
biodiversity. 


Spatial and Temporal Issues in Sampling 


Biological processes occur over time spans ranging 
from microseconds to millennia and over distances 
from nanometers to continents. However, most eco- 
logical processes occur over a more limited range of 
spatial areas and temporal durations that depend on 
the specific processes and organisms involved. As a 
consequence, understanding the dynamics and regula- 
tion of a particular process (or phenomenon such as 
species distribution) requires measuring the process 
rate, the results of the process, and the factors that in- 
fluence the process at appropriate spatial and tempo- 
ral scales. 

The concept of “scale” has long been recognized as 
an important issue in ecology. Nonetheless, much of 
the current confusion about the relationships between 
patterns and processes in ecology (e.g., local versus re- 
gional control of species diversity) stems from a failure 
to collect, analyze, and interpret data at the appropri- 
ate scale for the process of interest (cf. Cornell and 
Lawton 1992; Huston 1999). Although the term 
“scale” is often used loosely as a general concept related 
to the size of something, the technical use of the 
term in ecology has become muddled as ecologists 
have attempted to address phenomena that occur over 
large areas (e.g., “landscapes”) and at different “levels 
of organization” (e.g., individual organisms versus 
populations, cf. Morrison and Hall, Chapter 2). 

Landscape ecology has extended the definition of 
scale to include the total size of the area being consid- 
ered independent of the resolution at which an area 
(or time period) is measured or represented (Turner et 
al. 1989a), thus creating a definition that has two independent 
and unrelated elements (“grain,” defined as 
resolution, and “extent,” defined as total size). The 
confusion caused by this expanded definition is mag- 
nified by the fact that geographer/cartographers have 
always referred to scale specifically in terms of resolu- 
tion, so that a map at the scale of 1 foot to 24,000 feet 
is a larger-scale map (higher resolution) than a map at 
the scale of 1 foot to 100,000 feet. Ecologists, by incorporating 
the size of the total area (or “extent”) into 
the concept of “scale,” have reversed this definition by 
referring to the geographer's small-scale map as “large 
scale” because it covers a larger total area than the 
high-resolution map. 
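
The cartographic convention can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the original discussion: the function names and the 1-millimeter map feature are illustrative assumptions. It shows that the map with the smaller denominator (1:24,000) has the larger cartographic scale and the finer ground resolution.

```python
def ground_distance_m(map_distance_mm: float, scale_denominator: int) -> float:
    """Ground distance in meters represented by a distance measured on the map."""
    return map_distance_mm * scale_denominator / 1000.0


def is_larger_cartographic_scale(denominator_a: int, denominator_b: int) -> bool:
    """True if map A has the larger cartographic scale (larger fraction 1/denominator)."""
    return 1.0 / denominator_a > 1.0 / denominator_b


# The two map scales named in the text.
FINE_MAP = 24_000
COARSE_MAP = 100_000

# Assumed feature for illustration: a line 1 mm long on the printed map.
fine_ground = ground_distance_m(1.0, FINE_MAP)      # meters on the ground
coarse_ground = ground_distance_m(1.0, COARSE_MAP)  # meters on the ground

print(f"1 mm at 1:{FINE_MAP} covers {fine_ground} m")
print(f"1 mm at 1:{COARSE_MAP} covers {coarse_ground} m")
print("1:24,000 is the larger-scale map:",
      is_larger_cartographic_scale(FINE_MAP, COARSE_MAP))
```

Under the common ecological usage, the 1:100,000 map would be the one called "large scale," which is the reversal described above.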

Historically, the limitations of storage media cre- 
ated an inverse relationship between spatial resolution 
and the size of the area that could be represented on a 
single unit of the storage medium (i.e., the piece of 
paper on which a map is printed). However, with the 
development of electronic digital storage devices, this 
inverse correlation has been greatly relaxed to the ex- 
tent that the resolution and the size of the area (or 
time period) represented are essentially independent 
(e.g., Franklin 1995; Scott et al. 1993). Combining 
two independent properties (resolution and total size) 
in the single word “scale” has created a term that has 
no precise definition. Because the common ecological 
usage of “scale” has no clear meaning, it cannot be 
used comparatively (e.g., “large scale,” “small scale”). 
It is best treated as a generic term for all of the issues 
related to the size (area or duration) of samples, phe- 
nomena, or processes (Peterson and Parker 1998a; 
Csillag et al. 2000). Two of these issues are obviously 
sample resolution (the cartographic definition of 
scale) and the total size of the set from which samples 
are taken (the statistical “population” of space, time, 
or organisms). 

Sample resolution includes two distinct concepts. 
One is the size of a single sample, which can be an 
area of space, a period of time, or a group of organ- 
isms or objects, and is the minimum unit between 
which differences can be resolved. The size of the unit 
sample used in species occurrence modeling varies 
over several orders of magnitude from field plots or 
quadrats to pixels of satellite imagery used to charac- 
terize conditions over large areas. The size of the unit 
sample influences the value of parameters that are as- 
sociated with each unit area. Specifically, information 
on heterogeneity within the unit sample is lost when a 
single value is used to characterize a specific property 
(e.g., vegetation type) over a large area. This in- 
evitably results in the loss of information on rare con- 
ditions (Trani, Chapter 11; Henebry and Merchant, 
Chapter 23) and a consequent degradation of predic- 
tive ability (Boone 1997; Thomas et al., Chapter 10; 
Debinski et al., Chapter 44; Gonzalez-Rebeles et al., 
Chapter 57; Hunsaker et al., Chapter 61; Young 
and Hutto, Chapter 8). In general, higher spatial 
resolution leads to better predictive models, assuming 
data quality and size of the sample region are not 
compromised. 
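
The loss of rare conditions under coarser resolution can be illustrated with a toy raster. This is a hedged sketch only: the 4 x 4 grid, the class names, and the majority-rule aggregation are assumptions chosen for illustration, and real remote-sensing workflows use other resampling rules as well.

```python
from collections import Counter


def aggregate_majority(grid, factor):
    """Coarsen a square grid: each factor x factor block takes its most common class."""
    size = len(grid)
    coarse = []
    for r in range(0, size, factor):
        row = []
        for c in range(0, size, factor):
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            row.append(Counter(block).most_common(1)[0][0])
        coarse.append(row)
    return coarse


def classes_present(grid):
    """Set of all classes that appear anywhere in the grid."""
    return {cell for row in grid for cell in row}


# Hypothetical fine-resolution vegetation map with one rare wetland cell.
fine = [
    ["forest", "forest",  "grass",  "grass"],
    ["forest", "wetland", "grass",  "grass"],
    ["forest", "forest",  "forest", "grass"],
    ["forest", "forest",  "forest", "forest"],
]

coarse = aggregate_majority(fine, 2)
print(coarse)
print("lost classes:", classes_present(fine) - classes_present(coarse))
```

The single wetland cell is outvoted within its block, so the coarse map carries no record of the rare condition even though it was present at the finer resolution.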

The second concept related to sample resolution is 
measurement sensitivity, or the ability to distinguish 
differences between the measured values of a property 
of two different sampling units. Measurement sensitiv- 
ity can be a function of the detection limits of a chem- 
ical analyzer or the precision of a field measurement. 
Variation in measurement sensitivity produces data 
resolutions that range from continuous quantitative 
values, through ordered rankings, to presence/absence 
information, each of which requires different analyti- 
cal methods (Guisan, Chapter 25). 

In addition to the spatial and temporal resolution, 
the size of the overall area from which samples are 
collected (the sample region or sample period) is also 
important. The sample region (also called “extent,” 
see Morrison and Hall, Chapter 2) is a size issue that 
is independent of the size of the sampling unit (i.e., the 
resolution, or “grain”). As with resolution, the size of 
the total area that should be sampled depends on 
the processes of interest and the questions being 
addressed. 

The size of the sample region (or sample period) af- 
fects two distinct properties that are critical for pre- 
dicting ecological patterns such as species occurrence. 
The first is the total amount of whatever is contained 
in the area. Obviously, a larger area will have more of 
everything (e.g., forage, prey animals, nesting sites) 
than a small area with identical properties. This prop- 
erty is most important for issues related to population 
size and behavior. Understanding the factors that de- 
termine how a wolf pack uses its range requires a 
study area at least as large as the range and fully con- 
taining at least one complete range. However, there is 
no constant size that will always meet this objective, 
since the quality of “habitat” varies from one location 
to another and from one year or decade to another. In 
productive environments, a much smaller range size is 
needed to support a predator, or a viable population 
of predators, than in an unproductive environment, 
although the complexities added by animal territorial- 
ity may complicate the expected pattern (see Small- 
wood, Chapter 6). 

The second property of size is the environmental 
variability contained within a sample region (or en- 
countered over time). Again, all things being equal 
(which they rarely are), large areas will generally in- 
clude a greater range of environmental conditions 
(i.e., greater heterogeneity or variability) than 
smaller areas. Of course, most environmental variability 
results from the interaction of geology, topography, 
and weather, with the obvious consequence that an 
area in a mountainous region will have a much greater 
range of conditions than an area of the same size in 
the plains. 

The range of conditions sampled in a study (assum- 
ing an appropriate resolution and sample density) will 
determine the accuracy, precision, and generality of 
the resulting predictions (see Austin, Chapter 5). Stud- 
ies that include only a narrow range of environmental 
conditions (regardless of the total size of the sample 
region or study area) are likely to have very low preci- 
sion, because the unpredictable (random) effects of 
dispersal, disease, or other factors that are not corre- 
lated with environmental conditions will be much 
greater than any predictable responses to environmental 
conditions. The responses of an organism to the 
environment (i.e., whether it is a generalist or special- 
ist) also affect the range of conditions available for 
prediction. The habitat use pattern (i.e., specialization 
or “niche width”) of a species within the study area 
has a strong effect on the accuracy properties of the 
predictive model that can be developed (Fielding, 
Chapter 21; Johnson et al., Chapter 12; Garrison and 
Lupo, Chapter 30; Hepinstall et al., Chapter 53; 
Dettmers et al., Chapter 54). 

Increasing the number of samples in a region with 
low variability will increase the accuracy of the predic- 
tion, but the precision will still be low (i.e., a low pro- 
portion of variance will be explained). On the other 
hand, studies that include extremely broad ranges of 
conditions, most of which are unsuitable for the or- 
ganism or property of interest, will produce high accu- 
racy and precision over the sample region but will be 
of little use for predicting variation in population den- 
sity, or even where the species is most likely to occur, 
within the range of conditions it can survive (see Elith 
and Burgman, Chapter 24). Models of this type may 
be of little or no value outside the study area, where 
the range of conditions is different or narrower (e.g., 
in potentially suitable habitat) because they are not re- 
lated to the processes that determine species abun- 
dance and occurrence. 
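
The effect of the sampled range on the proportion of variance explained can be sketched with a small simulation. Everything here is an assumption made for illustration: a linear response with unit slope, Gaussian noise standing in for the random effects of dispersal or disease, and the particular ranges and sample size. The sketch uses "precision" in the sense of the text, that is, the proportion of variance explained.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from first principles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_r_squared(x_min, x_max, n=500, noise_sd=1.0, seed=0):
    """R^2 of response vs. environment when sampling is confined to [x_min, x_max]."""
    rng = random.Random(seed)
    environment = [rng.uniform(x_min, x_max) for _ in range(n)]
    # Assumed response: same underlying relationship everywhere, plus random noise.
    response = [x + rng.gauss(0.0, noise_sd) for x in environment]
    return pearson_r(environment, response) ** 2


r2_wide = simulate_r_squared(0.0, 10.0)    # broad range of conditions
r2_narrow = simulate_r_squared(4.5, 5.5)   # narrow range, same process

print(f"R^2 over a wide range:   {r2_wide:.2f}")
print(f"R^2 over a narrow range: {r2_narrow:.2f}")
```

The underlying relationship is identical in both cases; only the range of conditions sampled differs. Adding more samples within the narrow range would tighten the estimate of the slope but would not raise the proportion of variance explained.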

A third property forms the linkage between sam- 
pling resolution and the size of the sample region: 
sample density, or the number of measurements at a 
specific resolution per unit area. Satellite images, pho- 
tographs, phonographs, and maps have a sample den- 
sity of 1.0, which is to say that the entire area (or time 
period) is sampled at the maximum resolution of the 
method. However, ecological samples rarely have a 
density of 1.0, because of the time and effort required 
to collect quantitative environmental data. Typically, 
only a few percent of a total area (e.g., 1 hectare) are 
actually sampled with plots of a given resolution (e.g., 
one hundred 1-square-meter plots is 1 percent of 1 
hectare). Adequate sampling involves not only sam- 
pling at the appropriate resolution (time interval or 
plot size) for the process of interest, but also having a 
high enough density of samples to resolve the spatial 
or temporal patterns in sample values over the focal 
area (or time period) (Austin, Chapter 5; Karl et al., 
Chapter 51). 
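
The sample density in the hectare example can be checked directly. This is a minimal sketch; the function name is an illustrative assumption, and density is taken, as in the text, to be the fraction of the focal area actually covered by samples at the stated resolution.

```python
def sample_density(n_plots: int, plot_area_m2: float, region_area_m2: float) -> float:
    """Fraction of the region covered by sample plots (1.0 means complete coverage)."""
    return n_plots * plot_area_m2 / region_area_m2


HECTARE_M2 = 10_000  # 1 hectare = 100 m x 100 m

# One hundred 1-square-meter plots within 1 hectare, as in the text.
field_density = sample_density(100, 1.0, HECTARE_M2)

# A wall-to-wall image: every 1-square-meter cell of the hectare is measured.
image_density = sample_density(HECTARE_M2, 1.0, HECTARE_M2)

print(f"field plots: {field_density:.2%} of the area")
print(f"full image:  {image_density:.0%} of the area")
```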

The accuracy of predictions of species distributions, 
or of any ecological phenomenon, depends on 
whether scale issues are handled properly. The critical 
issue is not the specific resolution or total size of a 
study, but whether the resolution and size are appro- 
priate for the phenomena being studied and the hy- 
potheses to be addressed. The single most important 
issue with regard to accuracy of prediction is whether 
the data are collected at a resolution and density ap- 
propriate to identify and characterize the processes 
that determine the pattern of interest (e.g., species 
distribution). 

Understanding the relationship between a specific 
pattern and the processes that produce it is essential 
for accurate prediction (see Van Horne, Chapter 4; 
Smallwood, Chapter 6). Perhaps the largest impedi- 
ment to accurate prediction is the fact that many pat- 
terns can be detected at resolutions far coarser than 
the resolution needed to understand the processes that 
produce the pattern, which is also the resolution 
needed to predict the pattern. Presence/absence of a 
species is a phenomenon that produces patterns that 
can be represented at very low spatial resolution. 
However, the processes that result in presence/absence 
are the population dynamics that influence population 
growth rates, dispersal and colonization, and ulti- 
mately extinction or survival. A species that is present 
in only a small portion of a large area would be 
counted as present in the large area. However, the av- 
erage or median environmental conditions for the 
large area are likely to be very different than the con- 
ditions of the small subset of the area where the 
species occurs. Thus, predictions of species occurrence 
based on properties of the large area (i.e., low spatial 
resolution measurements) are unlikely to produce ac- 
curate predictions of where the species actually occurs 
or to lead to an understanding of the processes that 
lead to the low-resolution pattern. 

For example, mismatches between the size of the 
area in which species diversity is measured and the 
size of the area over which ecological processes that 
influence diversity actually operate have led to an 
overestimation of the influence of “large-scale processes” 
on patterns of species diversity (Huston 1999). The 
relative importance of local processes (e.g., competition) 
versus regional processes (e.g., evolutionary history) 
in controlling local species diversity is an issue of 
both theoretical and practical interest (Ricklefs 1987; 
Ricklefs and Schluter 1993). The critical issue is 
whether (1) local diversity “saturates” at some level 
that is independent of the total number of species in 
the region around the locality (implying local diversity 
is controlled by local processes), or (2) local diversity 
continues to increase as regional diversity increases 
(implying local diversity is controlled by regional 
processes). 

A recent study that compiled data on the species 
richness of different types of organisms on different 
continents over spatial areas ranging from “local” to 
“regional” in size found that local species richness was 
linearly correlated with regional species richness 
(Caley and Schluter 1997). This lack of evidence for 
local “saturation” of species richness supported the 
conclusion that local diversity is controlled primarily 
by regional processes and that local processes, such as 
competition, are relatively unimportant. However, a 
reevaluation of these results revealed a mismatch be- 
tween the resolution of species richness data and the 
distance over which local ecological processes operate 
(Huston 1999). The size of the area designated as the 
“region” was 500 × 500 kilometers, which is reasonable. 
However, the size of the “local” area was 1 percent 
of this area, or an area of 50 × 50 kilometers, 
which is vastly larger than the area over which local 
interactions occur between organisms. Furthermore, 
the physical heterogeneity included within an area of 
2,500 square kilometers is generally large enough that 
many different types of environments are represented, 
corresponding to many different “habitat types.” 
Since organisms that occupy different habitat types 
rarely interact, any sample that includes multiple habi- 
tats cannot reflect limitation of species richness by 
local interactions, since multiple localities with species 
that do not interact are aggregated (cf. Cornell and 
Karlson 1996; Karlson and Cornell in press; Huston 
1999). 
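
The areas involved in this reevaluation follow from simple arithmetic, sketched here with illustrative variable names (the names are assumptions; the numbers are those given in the text).

```python
import math

REGION_SIDE_KM = 500   # side of the square "region"
LOCAL_PERCENT = 1      # "local" area as a percentage of the region

region_area_km2 = REGION_SIDE_KM ** 2                     # 500 x 500
local_area_km2 = region_area_km2 * LOCAL_PERCENT // 100   # 1 percent of the region
local_side_km = math.isqrt(local_area_km2)                # side of an equivalent square

print(f"region: {region_area_km2} km^2")
print(f"local:  {local_area_km2} km^2 ({local_side_km} x {local_side_km} km)")
```

A "local" sample of this size is a square 50 kilometers on a side, far larger than any neighborhood within which individual organisms compete or otherwise interact.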

When species richness is evaluated in areas small 
enough for local interactions (e.g., competition, facili- 
tation, predation) to occur, the saturating effect of 
local interactions is clearly seen in a specific range of 
environmental conditions predicted by nonequilibrium 
theory, but not under all conditions (Huston 1999; 
Lord and Lee 2001; Karlson and Cornell in press). 
The effects of specific ecological processes can only be 
detected if ecological patterns are measured at the ap- 
propriate scale of resolution. 

Unless ecological patterns, and the environmental 
conditions that influence the processes that determine 
the patterns, are quantified appropriately, species oc- 
currences, patterns of species distributions, and varia- 
tion in population viability cannot be understood or 
predicted. The issues of sample resolution, sample 
density, and size (or duration) of a study must be ade- 
quately addressed before it is possible to quantify and 
understand the complex interactions of ecological 
processes. 


Ecological Processes and 
Theoretical Ecology 


Our understanding of ecological processes is ex- 
pressed in the body of hypotheses that can be called 
ecological theory. Although much of theoretical ecol- 
ogy seems to have little practical value (cf. Peters 
1991; Sarkar 1996; Suter 1981), theory potentially 
has value far beyond the satisfaction of intellectual cu- 
riosity. Hypotheses that are sufficiently robust to 
avoid definitive falsification not only provide a state- 
ment of our understanding of particular ecological 
processes, but also improve predictions by identifying 
the critical independent variables and describing their 
effect on ecological processes and properties. In addi- 
tion to refining predictions of known phenomena, 
strong theories can identify phenomena or patterns 
not previously known to occur. Such patterns may be 
discovered only because the theory predicted that they 
should occur. Discovery of such previously unreported 
phenomena clearly represents strong support for a 
theory, which would otherwise be considered falsified 
by the fact that the predicted phenomena were not 
known to occur. 

Although there are many subdisciplines of ecology 
with distinct sets of theories, such as behavioral ecol- 
ogy or aquatic ecology, the primary concern of this 
book is that area of ecology that addresses issues of 
community structure, including species distributions 
and species diversity. Community ecology is a broad 
subdiscipline spanning population ecology and ecosys- 
tem ecology and including processes operating over a 
range of dimensions from molecular to continental. 
Although there are many theories about various as- 
pects of community ecology, it lacks a rigorously 
tested and widely accepted theoretical framework. 
Lack of such a framework has been a major impedi- 
ment to sound resource management and conservation 
planning. 

Just as the individual organism has long been rec- 
ognized as the fundamental unit of natural selection, 
the individual organism with its behavior and physiol- 
ogy shaped by natural selection is also a fundamental 
unit of ecological processes. Over the past decade, 
there has been increasing recognition that the individ- 
ual organism can serve as a fundamental unit that al- 
lows integration of processes across dimensions rang- 
ing from molecular and genetic to ecosystems, 
landscapes, and even the globe (Huston et al. 1988; 
DeAngelis and Gross 1992; Huston 1994; Moorcroft 
et al. in press; Roloff and Haufler, Chapter 60). In 
most ecological research, observations and measurements 
of individual organisms are the basic units of 
organizational levels such as populations, communities, 
ecosystems, landscapes, and biomes. 

From the perspectives of both data analysis and 
computer modeling, aggregation from individual or- 
ganisms to populations, communities, and ecosystems 
is a straightforward process (Huston et al. 1988). Eco- 
logical theories and models are now being developed 
using the individual organism as the basis for under- 
standing higher-level phenomena (e.g., Smith and 
Huston 1989; Huston 1994; Moorcroft et al. in press; 
Shapiro et al., Chapter 49; Raphael and Holthausen, 
Chapter 62; Hartless et al., Chapter 39; Gross and 
DeAngelis, Chapter 40; Roloff and Haufler, Chapter 
60). 

The critical requirements for the success of this in- 
dividual-based approach are (1) a thorough, quantita- 
tive understanding of how individuals of particular 
species respond to the range of environmental condi- 
tions they may experience and (2) a quantitative de- 
scription of the relevant environmental conditions in 
the specific areas over which we desire to understand 
and predict ecological patterns at spatial and temporal 
resolutions appropriate for the organisms of interest. 
Our increasing ability to describe the physical envi- 
ronment at high spatial and temporal resolutions 
using satellite-based sensors and GIS (geographic information 
system) technology, as well as the ever-increasing 
knowledge base of the physiology and behavior of 
organisms, make this mechanistic approach to ecology 
feasible (Scott et al. 1993; Franklin 1995; Trani, 
Chapter 11). Many of the advances in our ability to 
predict wildlife habitat quality and species distribu- 
tions are the direct consequence of higher spatial and 
functional resolution (e.g., vegetation type and struc- 
ture, Young and Hutto, Chapter 8; DeAngelis et al. 
1998). Explaining and predicting patterns that can be 
detected at a coarse resolution (e.g., species occur- 
rence) generally requires information on processes and 
environmental conditions that must be collected at a 
much finer resolution. The importance of basing predictions 
on processes rather than on pattern correlations, 
and of measuring both patterns and processes at 
sufficiently high levels of spatial resolution, are themes 
that recur in many of the chapters in this volume. 

The following three examples illustrate how different 
types of theories, all of which have the individual 
organism as the fundamental unit of process and 
measurement, can contribute to understanding ecolog- 
ical processes, to identifying the appropriate resolu- 
tions at which patterns are predicted to appear, and to 
suggesting statistical methods for quantifying those 
patterns. In each example, failure to understand the 
regulation of ecological processes leads to either (1) 
underestimation of the significance and miscalculation 
of the shape and magnitude of a causal relationship 
between environmental conditions and ecological 
processes or patterns, or (2) complete failure to detect 
existing causal relationships between environmental 
conditions and ecological processes or patterns. 


Interactions of Limiting Resources: 
Liebig's Law of the Minimum 


In 1840, J. Liebig published the seminal observation 
that even if most of the chemical elements needed by 
plants are abundant, plant growth can still be limited 
to low levels if a single critical element is in short supply. 
Much of the success of modern agriculture is built 
upon this insight, particularly for crops on the highly 
weathered soils of the humid tropics and arid Aus- 
tralia, where even micronutrients such as molybde- 
num or copper can limit plant growth (Jones and El- 
liott 1944; Hingston et al. 1980; McArthur 1991; 
Burvill 1979). Agronomists have long recognized that 
there are other types of interactions between nutrients, 
such as complementarity, that must be considered in 
crop management and yield prediction (Heady et al. 
1955). Nonetheless, the generality of the law of the 
minimum has held up for over a century and a half 
and provides a conceptual framework that is useful 
for a wide range of ecological processes in addition to 
plant growth. 

Virtually every ecological process is influenced by 
multiple factors, some of which are resources that can 
be depleted by organisms and others of which are reg- 
ulators, such as temperature, that organisms cannot af- 
fect (Huston 1994). Although any resource or regula- 
tor can potentially occur at levels that limit the rate of 
a process, it is unlikely that the same single factor will 
always be limiting. Spatial and temporal variation in 
resources and regulators results in a continual shifting 
of limitation from one factor to another. For example, 
in seasonal climates, water is likely to limit plant 
growth during the dry season regardless of the levels of 
soil nitrogen or phosphorus, but during the wet season 
it is probable that nitrogen, phosphorus, light, or some 
factor other than water will limit plant growth. 

This shifting between limiting factors has unfortu- 
nate consequences for understanding and predicting 
ecological processes, including processes as fundamen- 
tal as plant growth. When processes are regulated ac- 
cording to the law of the minimum, variation in a 
nonlimiting factor has little or no effect on the rate of 
the process of interest. For example, measurements of 
the growth response of plants will have little correla- 
tion with variation in soil nitrogen, one of the most 
important potentially limiting plant resources, if some 
other resource, such as water, is currently at limiting 
levels. This lack of any correlation between plant 
growth and nitrogen obviously does not mean that ni-
trogen is unimportant for plant growth or is not a
major regulator of plant growth rates but simply that 
nitrogen is not limiting under some or all of the condi-
tions when the measurements were made. Process
rates will be correlated only with variation in the lim- 
iting resource, and only under these conditions will re- 
gression analysis reflect the potential effect of the lim- 
iting resource on the process. Consequently, the 
"true" or maximum potential response to any specific 
resource or regulator of a process can only be quanti- 
fied when all other resources or regulators occur at 
nonlimiting levels. 


Unfortunately, most ecological research under field 
or experimental conditions measures only a subset of 
the factors that potentially limit the process under in- 
vestigation. Ecological studies typically include a mix- 
ture of measurements made under both limiting and 
nonlimiting conditions and thus produce datasets with
high variance (and low correlation coefficients) that 
hinder or prevent detection of the actual response of a 
process to a specific factor (cf. Huston 1997).

Figure I.1. Interactive effect of one through four limiting resources on an ecological response regulated according
to Liebig's Law of the Minimum. (a) Hypothetical measured response to random variation in Resource
1, when Resource 1 is the only limiting factor. The only variance is added random error. (b) Observed response to
Resource 1, with random variation in one additional limiting resource. The solid regression line is the predicted re-
sponse using only Resource 1. The dotted lines in b, c, and d indicate the upper bound of data, which is a close ap-
proximation of the "true" response of (a). The lower regression equation is based on precise measurement of both
resources (x and z), with multiplicative interaction term. (c) Observed response to Resource 1, with random variation
in two additional limiting resources. The solid regression line is the predicted response using only Resource 1. The
lower regression equation is based on precise measurement of all three resources (x, z, and w), with multiplicative
interaction terms. (d) Observed response to Resource 1, with random variation in three additional limiting re-
sources. The solid regression line is the predicted response using only Resource 1. The lower regression equation
is based on precise measurement of all four resources (x, z, w, and q) with multiplicative two-way interaction terms.
Note that (1) the departure of the statistical relationship of the response to Resource 1 from the actual response
(a) increases with additional limiting factors (b, c, d); (2) the variance of the measured response increases at in-
creasing levels of Resource 1 with one or more additional limiting factors (i.e., variance amplification); and (3)
the same phenomenon occurs with complex nonlinear relationships as with the linear relationship illustrated here.

It is usually impossible to determine whether the
measured factors are actually limiting at the time the 
measurements are made. If the measured factor is the 
only limiting condition, there is likely to be a strong 
correlation between the process rate and the level of 
the factor, providing a good estimator (e.g., regres- 
sion) of the effect of the factor on the process (Fig. 
I.1a). However, if some of the measurements are made 
at times or locations where the measured factor is not 
the limiting condition, there should be little or no cor- 
relation and no predictive relationship between the 
measured factor and the process. As additional factors 
become limiting at the places or times that measure- 
ments of the response are made, the correlation be- 
tween the response and the hypothesized driving vari- 
able becomes rapidly weaker (Figs. I.1b,c,d). If all of 
the periodically limiting factors are identified and 
measured, more of the response variability can be sta-
tistically "explained," but the standard linear (or non-
linear) regression still cannot describe the true rela-
tionship between each of the limiting factors and the
measured response. 
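
The weakening correlation described above can be reproduced with a small simulation. The sketch below is only an illustration under assumed conditions: resources are drawn independently and uniformly between 0 and 1, the response is taken to be the minimum of the resources present plus a small Gaussian error, and the function names (`pearson_r`, `simulate_correlations`) are hypothetical rather than taken from any of the cited studies.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_correlations(n=5000, max_extra=3, noise_sd=0.02, seed=0):
    """Correlation of the response with Resource 1 when 0..max_extra
    additional resources can also be limiting (assumed min() response)."""
    rng = random.Random(seed)
    # Column 0 is Resource 1; the other columns are the extra resources.
    resources = [[rng.random() for _ in range(max_extra + 1)]
                 for _ in range(n)]
    resource_1 = [row[0] for row in resources]
    correlations = []
    for extra in range(max_extra + 1):
        # Law of the minimum: the scarcest resource sets the response.
        response = [min(row[: extra + 1]) + rng.gauss(0.0, noise_sd)
                    for row in resources]
        correlations.append(pearson_r(resource_1, response))
    return correlations


if __name__ == "__main__":
    for extra, r in enumerate(simulate_correlations()):
        print(f"{extra} additional limiting resources: r = {r:.2f}")
```

Under these assumptions the correlation with Resource 1 is close to 1 when it is the only limiting factor and declines as each further resource is added, even though the underlying response to Resource 1 never changes.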

This problem has been receiving increasing atten- 
tion in the ecological and statistical literature (e.g., 
Thompson et al. 1996). A number of approaches to 
quantifying responses under conditions with multiple 
limiting factors have been developed, involving the use 
of “upper bound” or “envelope” statistical ap-
proaches (e.g., Maller 1990; Kaiser et al. 1994). One
recent contribution describes a relatively simple and 
intuitive approach to quantifying the actual response 
of an ecological process to a single limiting factor in a 
situation with multiple limitations (Cade et al. 1999). 
The "quantile regression" approach provides a stan-
dardized method (with computer software available)
for identifying and quantifying the upper bound of the
cloud of points (i.e., high variance) of a typical ecolog-
ical response (Fig. I.2).
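
To make the idea of an upper-quantile fit concrete, the following sketch compares an ordinary least-squares line with a 95th-quantile line on simulated data. It is not the software referred to above. It is a brute-force illustration that relies on the fact that, with a single predictor, a line minimizing the quantile ("check") loss can always be found among the lines passing through two data points. The simulated data, the unmeasured second factor, and all function names are assumptions made for this example.

```python
import random


def check_loss(residual, tau):
    """Asymmetric absolute ("check") loss used in quantile regression."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def quantile_line(xs, ys, tau):
    """Return (intercept, slope) minimizing the total check loss.

    Brute force over lines through every pair of points; practical only
    for small samples and one predictor.
    """
    points = list(zip(xs, ys))
    best = None
    for i, (x1, y1) in enumerate(points):
        for x2, y2 in points[i + 1:]:
            if x1 == x2:
                continue  # a vertical line is not a regression line
            slope = (y2 - y1) / (x2 - x1)
            intercept = y1 - slope * x1
            loss = sum(check_loss(y - intercept - slope * x, tau)
                       for x, y in points)
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    return best[1], best[2]


def ols_line(xs, ys):
    """Ordinary least-squares (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate(n=120, seed=1):
    """Response limited by the measured factor x or an unmeasured factor."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    # The "true" response to x alone has slope 1; the hypothetical
    # unmeasured factor pulls many observations below that bound.
    ys = [min(x, rng.uniform(0, 10)) + rng.gauss(0, 0.2) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = simulate()
    print("OLS slope:           %.2f" % ols_line(xs, ys)[1])
    print("95th quantile slope: %.2f" % quantile_line(xs, ys, 0.95)[1])
```

In this simulated case the least-squares slope falls well below the slope of the response to the measured factor alone, whereas the upper-quantile slope stays close to it.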

Thus, the high variance that is often found in cor- 
relations between ecological processes and presumed 
causal factors may not be sampling error or random 
“noise” but rather the mechanistic consequence of 
shifts between limiting resources or the effects of other 
limiting factors such as mortality or dispersal (see 
O'Connor, introduction to Part 1). Although Liebig’s 
Law of the Minimum was originally proposed in rela-
tion to plant growth, the same phenomenon can occur
with any process that is regulated by more than one 
factor, which includes virtually all ecological pro- 
cesses. For example, the successful establishment and 
growth of a particular plant species may require a par- 
ticular minimum level of nitrogen above which plant 
growth and biomass during succession are positively
correlated with nitrogen availability. However, if for 
some reason seeds of the species never reach a given 
area, or the seed supply is extremely low, then the 
abundance of the plant may be very low under favor- 
able nitrogen conditions. 

Identifying the true relationship between a species’ 
abundance and environmental conditions is compli-
cated by the "false negatives" that result from the fail-
ure of a species to colonize all areas where it could po-
tentially thrive (see Fertig and Reiners, Chapter 42;
Fleishman et al., Chapter 45). Although many differ- 
ent factors can potentially limit an ecological re- 
sponse, in many cases it is the availability of the or- 
ganisms themselves that determines the presence or 
absence of a particular species and such ecological 
properties as species richness (e.g., Tilman 1997; 
Hubbell et al. 1999; Fleishman et al., Chapter 45). 

This limitation of the dispersal and spread of or- 
ganisms is an inevitable natural phenomenon that 
varies predictably in response to local environmental 
conditions and the life history and dispersal properties 
of the organisms themselves (Harrison et al. 1992; 
Nekola and White 1999). Dispersal is a process that is 
strongly influenced by chance and probability, and it is 
well known that plants and animals whose propagules 
(e.g., spores) are borne passively on the wind tend to
quickly colonize all suitable habitats. Birds obviously 
tend to reoccupy suitable habitats following local ex- 
tinction much more rapidly than less-mobile animals, 
such as turtles or frogs. As a consequence, the proba- 
bility of a species actually occupying an area of suit- 
able habitat increases with the mobility of the species 
but decreases with the frequency of climatic (or other) 
events that kill or displace organisms. 
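
One way to make this qualitative statement concrete is a two-state patch model. The model is an assumption introduced here for illustration and is not a method described in the text: a suitable patch that is empty is colonized in a given time step with probability `colonization` (standing in for the mobility of the species), and an occupied patch loses its population with probability `extinction` (standing in for the frequency of lethal events). The long-run chance of finding the patch occupied is then colonization / (colonization + extinction).

```python
import random


def equilibrium_occupancy(colonization, extinction):
    """Long-run probability that a suitable patch is occupied in a
    two-state (empty/occupied) Markov chain."""
    return colonization / (colonization + extinction)


def simulated_occupancy(colonization, extinction, steps=200_000, seed=0):
    """Fraction of time steps in which one simulated patch is occupied."""
    rng = random.Random(seed)
    occupied = False
    occupied_steps = 0
    for _ in range(steps):
        if occupied:
            if rng.random() < extinction:
                occupied = False   # local extinction
        elif rng.random() < colonization:
            occupied = True        # recolonization by dispersal
        occupied_steps += occupied
    return occupied_steps / steps


if __name__ == "__main__":
    for c, e in [(0.30, 0.10), (0.05, 0.10), (0.30, 0.40)]:
        print(f"c={c:.2f} e={e:.2f} "
              f"expected={equilibrium_occupancy(c, e):.2f} "
              f"simulated={simulated_occupancy(c, e):.2f}")
```

In this sketch, occupancy rises with the colonization probability and falls with the extinction probability, which mirrors the dependence on mobility and on the frequency of lethal events stated above.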

It is in the arena of dispersal that the theory of is- 
land biogeography (MacArthur and Wilson 1967) is 
most relevant, along with such conservation biology 
issues as corridors, “matrix” properties, and the like. 
The probabilistic nature of dispersal results in high
predictability at coarse resolution over large areas in
the distribution of organisms among suitable habitats
but unfortunately also leads to low predictability of
presence or absence at high resolution (i.e., for specific
small localities).

Figure I.2. Graphical representation of quantile regression
method illustrating subdivision of data into quantiles to quan-
tify maximum response (presumed response when only the x-
axis factor is limiting). (A) Relationship of response (biomass)
to independent variable (habitat factors) with effects of addi-
tional limiting (nonhabitat) factors controlled by adjusting bio-
mass for nonhabitat factors. (B) Relationship of biomass to
habitat factors without controlling for nonhabitat effects (i.e.,
assuming nonhabitat factors are unknown and/or unquanti-
fied). Note that the 95th quantile regression in B closely ap-
proximates the “true” relationship shown by the adjusted data
in A. (Fig. 2 from Cade et al., 1999).

This phenomenon is a critical issue in the predic-
tion of species occurrence and undoubtedly accounts
for a large proportion of errors of commission in
which a species is predicted to occur in an area but is 
not observed there. In these cases, prediction of the 
suitability of a habitat to support a species (presum- 
ably based on the population responses of the species 
to relevant environmental conditions) may be correct, 
but the limiting factor of dispersal (or recent mortal- 
ity, or inadequate time for a population to build to de- 
tectable levels following mortality, etc.) prevents the 
predicted population from being actually observed 
(see Fleishman et al., Chapter 45). Methods to predict 
the expected level of errors of commission can poten- 
tially be based on estimates of disturbance frequencies 
and locations (see Guisan et al. 1999) and estimates of 
dispersal rates based on organismal properties, dis- 
tance to source populations, and geographical barriers 
or restrictions on dispersal (e.g., Nathan et al. 2001). 
Use of an upper quantile relationship (see Cade et al. 
1999) in developing predictive models of species oc- 
currence could reduce the confounding effects of dis- 
persal limitations on models of potential species abun- 
dance and distributions. Evaluation and comparison 
of the accuracy of species occurrence models are ad- 
dressed in many of the chapters in this volume (e.g., 
Hepinstall et al., Chapter 53; Johnson et al., Chapter 
12; Johnson and Sargeant, Chapter 33; Fielding, 
Chapter 21; Karl et al., Chapter 51; Pearce et al., 
Chapter 32). 

Another important consequence of the interactive 
effect of multiple limiting resources on a process is the 
amplification of variance in the measured process in 
response to increasing levels of the limiting resource 
that is measured as a driving variable. This phenome- 
non of increasing variance (the variance amplification 
hypothesis) has important implications both for the 
quantification of ecological processes (e.g., prediction 
of habitat quality) and for the possible destabilization 
of ecological interactions in response to increasing lev-
els of limiting factors (e.g., temperature, nitrogen, car-
bon dioxide).


The interaction of multiple limiting factors, includ-
ing processes that cause mortality, as well as resource
availability and dispersal probability, produces a 
pattern in which unpredictability (i.e., variance) in- 
creases as environmental conditions become more 
“favorable,” where favorability is interpreted as
higher levels of one or more limiting resources (e.g.,
more fertile soils, more favorable temperature, opti- 
mal moisture conditions). Under such conditions of 
high levels of all limiting factors, biological responses 
such as plant growth or the rate of increase of animal 
populations can be extremely high. However, when a
single limiting factor is reduced to a low level (e.g., a 
drought), the biological response can suddenly drop to 
a very low level, leading to the impression that the bi- 
ological response is unstable and unpredictable. 
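
The increase in variance toward the "favorable" end of a gradient can be checked numerically. The sketch below assumes, purely for illustration, a response equal to the minimum of a measured resource and one unmeasured limiting factor, both uniform between 0 and 1, and compares the variance of the response in the lowest and highest thirds of the measured resource. The function name is hypothetical.

```python
import random
from statistics import pvariance


def response_variance_by_level(n=6000, seed=2):
    """Variance of the response in the low and high thirds of Resource 1."""
    rng = random.Random(seed)
    low, high = [], []
    for _ in range(n):
        measured = rng.random()     # Resource 1, the measured driver
        unmeasured = rng.random()   # hypothetical second limiting factor
        response = min(measured, unmeasured)
        if measured < 1 / 3:
            low.append(response)
        elif measured > 2 / 3:
            high.append(response)
    return pvariance(low), pvariance(high)


if __name__ == "__main__":
    var_low, var_high = response_variance_by_level()
    print(f"variance at low Resource 1:  {var_low:.4f}")
    print(f"variance at high Resource 1: {var_high:.4f}")
```

Where the measured resource is scarce it almost always sets the response, so the spread is small. Where it is abundant, the response is free to range from near zero up to the level the measured resource allows, depending on the other factor.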

Although this type of "instability" can be very dra-
matic, it is a completely predictable consequence of
the variance amplification hypothesis, based on the in- 
teraction of multiple limiting factors. Such sudden 
changes in ecological responses that have been inter- 
preted as “chaos” (e.g., Tilman and Downing 1994) 
or "instability" caused by low species diversity 
(Tilman 1996) can be simply understood as the vari- 
ance amplification that results from interacting limit- 
ing resources (Huston 1997). In addition to increasing 
response variance along gradients of increasing re- 
source availability, other complex patterns can appear 
along gradients of environmental conditions, includ- 
ing complete reversal of response dynamics from one 
end of a gradient to the other. 


Single-factor Response Reversals: 
The Unimodal Diversity-productivity Curve 


Predictive models require quantification of the rela- 
tionship between the environmental property of inter- 
est (e.g., population size, species presence, species 
richness) and the environmental conditions hypothe- 
sized to influence that property. This quantification 
may be purely descriptive, as in the case of a statistical 
regression, or may be mechanistic at some level, as 
with population models or plant succession models. In 
either case, the predicted relationship is based on or 
tested against observations (at the appropriate spatial 
and temporal resolution and density) from experi- 
ments or field sampling, ideally over the full range of 
environmental conditions to which the model will be 
applied. The confounding effects of multiple limiting 
factors on the quantification of cause-and-effect rela- 
tionships (whether linear or nonlinear) were discussed 
above. Other types of problems can occur as a result
of nonlinear relationships, which are common in eco-
logical phenomena.

A potentially confusing type of nonlinear relation- 
ship is the unimodal or “humpbacked” response curve 
in which the response variable first increases at low 
levels of an environmental variable and then decreases 
at higher levels. With patterns of this type, data col- 
lected at high-enough resolution along different por- 
tions of an environmental gradient can lead to statisti- 
cally significant, but opposing, conclusions about the 
relationship between an ecological response and envi- 
ronmental conditions, depending on where the data 
were collected. This response reversal is an inherent 
feature of any distribution that has a central maxi- 
mum with declining values toward higher and lower 
levels, including both the Gaussian or "normal"
curve, and the more complex, skewed curves typical of 
species distributions along environmental gradients 
(e.g., Mueller-Dombois and Ellenberg 1974; Austin 
1980, 1999b; Austin et al. 1990; Austin and Smith 
1989). This phenomenon underlies one of the longest- 
running debates in ecology: the relationship between 
productivity and species diversity. 

Numerous studies have found that species diversity 
increases with increasing productivity, measured as net 
primary productivity of plants, secondary productivity 
of animals, or some factor presumed to be correlated 
with productivity, such as nutrients, temperature, or 
water availability (see recent reviews in Rosenzweig 
and Abramsky 1993; Huston 1994; Waide et al. 
1999). Examples are found in plants (Grime 1979; 
Huston 1979, 1980), fish (Dodson et al. 2000), and 
mammals (Abramsky and Rosenzweig 1983; Owen 
1988). However, numerous other studies have found 
that species diversity decreases with increasing pro- 
ductivity, particularly in plants, both aquatic (e.g., 
phytoplankton) and terrestrial (Al-Mufti et al. 1977; 
Huston 1979, 1980, 1994; Keddy 1989; Keddy et al. 
1997; Tilman 1987, 1993; Abramsky and Rosenzweig 
1983). There are numerous theoretical reasons, sup- 
ported by a fair amount of data, to believe that the 
maximum diversities of plants and animals are likely to
occur under different environmental conditions (Hus-
ton 1994). This is more likely to be true for generalist
species at the herbivore trophic level and above and
less likely to be true for specialists on plants, whose
diversity is likely to be correlated with plant diversity
(Huston 1994; Huston and Gilbert 1996). 

Focusing on plants alone, it is evident that the in- 
creases of diversity with productivity tend to occur 
under very unproductive conditions, where any im- 
provement in conditions allows more species to sur- 
vive. Most of the decreases in diversity are found at the 
more productive end of an environmental gradient, 
where competitive interactions among rapidly growing 
plants lead to the elimination of less-competitive (gen- 
erally smaller or more slowly growing) species (Grime 
1973a, 1979; Huston 1979, 1980; Bakker 1989; 
Reader and Best 1989; Berendse 1994). Together these 
patterns make a unimodal response curve, which was 
first described by Grime (1973a,b, 1979) and later doc- 
umented in marine, aquatic, and terrestrial ecosystems 
around the world (Huston 1979, 1980, 1985, 1994; 
Keddy 1989; Dodson et al. 2000; Rosenzweig and 
Abramsky 1993; Grace 1999; Waide et al. 1999). Al- 
though productivity (plant growth rate) influences the 
rate at which competitive differences are expressed in 
the absence of mortality-causing disturbances, produc- 
tivity itself is a complex response to a variety of envi- 
ronmental conditions (including temperature, light, 
water, and mineral nutrients) each of which can vary 
independently over environmental gradients (as well as 
over time) and each of which can have independent ef- 
fects on the growth and survival of plants. 

Ecological studies that cover a large portion of an 
appropriate environmental gradient (e.g., plant pro- 
ductivity or some factor such as a soil nutrient that is 
strongly correlated with productivity) are likely to re- 
veal unimodal abundance distributions of individual 
species, as well as a unimodal diversity pattern (e.g., 
Guo and Berry 1998; Grace 1999). However, the three 
different conclusions about diversity responses that 
can be drawn from statistically significant (or insignif- 
icant) correlations along different portions of a pro- 
ductivity gradient are wrong when applied to the en- 
tire gradient: (1) diversity increases monotonically 
with productivity, (2) diversity is uncorrelated with 
productivity, and (3) diversity is negatively correlated 
with productivity. 
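
The three partial conclusions listed above can be reproduced from a single unimodal curve. The sketch below assumes a symmetric, Gaussian-shaped richness curve with added noise, which is a simplification chosen for illustration (the text notes that real curves are usually skewed), and computes the correlation between richness and productivity over the low, middle, and high portions of the gradient. The curve parameters and function names are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def hump(productivity, peak=5.0, width=2.0, max_richness=20.0):
    """Assumed unimodal richness curve along a 0-10 productivity gradient."""
    return max_richness * math.exp(
        -((productivity - peak) ** 2) / (2 * width ** 2))


def sample_gradient(lo, hi, n=400, noise_sd=1.0, seed=3):
    """Sample noisy richness values between productivity lo and hi."""
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    ys = [hump(x) + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


if __name__ == "__main__":
    for label, lo, hi in [("low", 0, 4), ("middle", 4, 6),
                          ("high", 6, 10), ("full", 0, 10)]:
        r = pearson_r(*sample_gradient(lo, hi))
        print(f"{label:6s} portion ({lo}-{hi}): r = {r:+.2f}")
```

Sampling only the low portion gives a strong positive correlation, only the high portion a strong negative one, and the middle portion or the full gradient a linear correlation near zero, although the same curve generated all of the data.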

Further confusion about the relationship between 
species diversity and productivity (or other environ-
mental factors) results from failure to measure diver-
sity at resolutions appropriate to ecological processes
and environmental variability (Cornell and Lawton 
1992; Huston 1999), and from failure to consider the 
limits imposed by regional species pools on the range 
in diversity that is potentially detectable (e.g., with 
only three species in a region, the range in diversity is 
0-3, which is less likely to reveal statistically signifi- 
cant variation in diversity than a situation in which 
the potential range in diversity is 0-20 [cf. Huston 
1999; Lord and Lee 2001]).

Much of the confusion about the unimodal diver- 
sity-productivity relationship has resulted from incom- 
plete sampling of the productivity gradient and sam- 
pling at incompatible resolutions for linking patterns 
to processes and environmental conditions (e.g., 
Waide et al. 1999). However, in addition to species di- 
versity, unimodal responses of other ecological proper- 
ties are quite common and can lead to the same types 
of confusion and erroneous conclusions about func- 
tional relationships that occur with diversity and pro- 
ductivity. Examples include the abundance of species 
along an environmental gradient. Along any gradient 
associated with the same processes that produce a uni- 
modal pattern of species richness, it is inevitable that 
most species will increase in abundance at low values 
of the environmental factor and then decrease in 
abundance (relative and probably absolute) at high 
values of the factor. 

This raises the question, “why is a species rare or 
absent in what is apparently optimal habitat within 
dispersal distance of known populations?” There are a 
number of potential explanations for this phenome- 
non, all of them linked to limiting factors other than 
the physical environment. A decline in the abundance 
of many species under favorable abiotic conditions is 
most likely to result from an increasing frequency and 
intensity of biotic interactions. In the case of plants, 
increased intensity of competition for light under pro- 
ductive stable conditions often eliminates smaller, 
shade-intolerant species and leads to a reduction in 
species diversity (e.g., Guisan et al. 1998; Grace 1999; 
Reader and Best 1989; Berendse 1994). 

Other interactions may also negatively affect 
smaller or less-abundant plant species, including in- 
creased densities of pathogens or herbivores (Connell 
1978; Hubbell 1980; Augspurger 1983; Clark and
Clark 1984; Nathan et al. 1999). Similar interactions
may lead to the reduction or elimination of animal 
species, including direct competition (such as aggres- 
sion or interference competition, Brown 1973) as well 
as higher predation pressure caused by the higher 
predator densities supported by other prey species 
(i.e., "indirect interactions," Holt 1984). These phe- 
nomena emphasize the importance of distinguishing 
between the physiological (potential) niche and the 
ecological (actual) niche of organisms. This refinement 
of the "niche concept" makes the distinction between
the physiological limitations to species distributions 
and the biological limitations, and it demonstrates the 
complex distribution patterns that can result from 
these interactions (Ellenberg 1956; Mueller-Dombois 
and Ellenberg 1974; Walter 1964/1968, 1970; Austin 
and Smith 1989; Austin, Chapter 5). 

Even an appropriate statistical description of a uni- 
modal distribution of species abundance along an envi- 
ronmental gradient may not lead to better predictions 
of the spatial distribution of the species. The problem 
is that the observed abundance distribution of a species 
represents its "ecological niche," which may not in-
clude the full range of conditions under which the
species could potentially be found, and, in particular, 
may not include the physical conditions where the 
species actually grows best (in the absence of negative 
biotic interactions such as competition). The observed 
distribution of a species may differ from its potential 
distribution (or "physiological niche"), if the species is
excluded from the conditions where it grows best by 
competition or other negative effects of other species. 
Depending on the pattern of overlap of the physiologi- 
cal responses of potential competitors, the realized 
niche of a species that is a poor competitor under opti- 
mal environmental conditions may be displaced into a 
skewed or even a bimodal distribution along the envi-
ronmental gradient. These alternative (i.e., non-Gaus-
sian) patterns of species responses along environmental
gradients were first described by Ellenberg (1956) and 
Walter (1964/1968) and conceptually developed in the 
context of continuum theory by Austin (1980; Austin 
et al. 1990) and Austin and Smith (1989). 

Two strikingly different patterns of fundamental 
niches (physiological responses) can be identified. 
Along resource gradients, such as soil nitrogen or
other consumable resources, most species tend to have
overlapping physiological optima at the high end of 
the gradient. However, for other environmental fac- 
tors that are not affected or used by organisms, such 
as temperature, no single level is superior and species 
tend to have dissimilar physiological optima distrib- 
uted along the gradient. In the case of resource gradi- 
ents, the actual conditions in which species are found 
(the "ecological niche") are often displaced toward
lower resource levels as a result of the dominance of 
better competitors under the most favorable resource 
conditions. For regulator (e.g., temperature) gradients, 
the ecological niches of poorer competitors can be dis- 
placed in either direction or may be split into a bi- 
modal distribution (Austin 1980; Austin and Smith 
1989). There are very few cases in which symmetrical 
bell-shaped species distributions are found in nature 
(Austin 1980, 1999b, Chapter 5). 

Although the complex patterns of species distribu- 
tions in relation to environmental conditions (ecologi- 
cal niches) that are found in nature seem to make the 
quantitative prediction of species occurrence even 
more difficult, a focus on the fundamental (or physio- 
logical) response of species helps resolve this difficulty. 
Understanding the physiological optima of a rare 
species provides critical information about the envi-
ronmental conditions (e.g., soil nutrients and mois-
ture, temperature, vegetation structure, etc.) where it
could potentially occur, even if it is rarely found under 
those conditions. This information is critical for 
restoration, conservation, and management of rare 
species because these are the conditions in which, with 
appropriate management, the species actually does 
best. Management, such as control of predators or 
restoration of a different (and possibly more natural) 
disturbance regime may allow some species to thrive 
in areas where they are rarely found under present 
conditions (Abbott 2000; Risbey et al. 1999).

Differences in mobility and resource specialization 
among organisms have the consequence that distribu- 
tion patterns of fundamental and realized responses 
along resource gradients tend to be very different for 
plants versus animals. Most plants have their greatest
survival, growth, and fecundity under similar favor-
able conditions of nutrients, water, and light. How-
ever, animals have a much greater range of opportuni-
ties for specialization in resource use, and thus even
potential competitors are likely to have different op-
tima in their fundamental responses (Huston 1994).

The complexity of ecological responses along
"single-factor" gradients results from the fact that dif- 
ferent ecological processes predominate on different 
parts of the gradient. This shift in processes is most 
conspicuous along resource gradients and is unlikely 
to occur as consistently along regulator gradients. At 
very low resource levels, most species grow poorly, if 
they can survive at all. With increasing levels of the 
limiting resource, more species can survive, and all 
species grow better. Further increases in the limiting 
resource potentially allow all species to grow even bet- 
ter, except that a few of the species that grow best typ- 
ically dominate and eliminate other species that are 
poorer competitors for some other resource (often 
light in the case of plant competition). Thus, the dom- 
inant process along the gradient shifts from physiolog- 
ical tolerance to low resource levels at the low end of 
a particular resource gradient, to competition at the 
high end (Grime 1979). Addition of a second factor 
along an environmental gradient adds further com- 
plexity to ecological responses as illustrated below by 
the interaction between disturbance and productivity. 


Multifactor Response Reversals: 
Complex Effects of Mortality on Diversity 


The effect of periodic mortality of organisms on eco- 
logical processes and patterns has long been recog- 
nized (Andrewartha and Birch 1954; Loucks 1970; 
Connell 1978; White 1979). However, the use of mor- 
tality-causing disturbances, such as clearcuts or fires, 
as environmental management tools has only recently 
become common practice (Wright and Bailey 1982; 
Romme and Despain 1989; White et al. 1991) and re- 
mains politically controversial in many regions. 

The intermediate disturbance hypothesis, attributed 
to Connell (1978; see also Fox 1979), summarizes the 
general observation (Paine 1966; Loucks 1970; 
Lubchenco 1978; Sousa 1979; White 1979) that high 
levels of species diversity are often found in situations 
with a moderate amount of mortality, while both low 
and high levels (intensity and/or frequency) of mortal- 
ity tend to be associated with lower levels of diversity. 


The processes that produce this unimodal diversity 
pattern are analogous to those that produce the uni- 
modal diversity-productivity relationship but are re- 
versed along the primary axis. Thus, high levels of 
mortality create conditions in which few species are 
able to survive, analogous to the effects of extremely 
low productivity, while low levels of mortality allow 
competitive exclusion to occur and reduce species di- 
versity, analogous to the effects of high productivity. 

In spite of the logical soundness and convincing an- 
ecdotes that support the intermediate disturbance hy- 
pothesis, many systematic efforts to test it have found 
that it does not work very well. Two recent reviews 
(Feminella and Hawkins 1995; Steinman 1996) on the 
effects of grazing (a type of mortality-causing distur- 
bance) in aquatic systems concluded that the effects of 
grazing on species diversity of algae were not pre- 
dictable. Some studies found that grazing increased 
species diversity while other studies found that it de- 
creased diversity, but there was no consistent pattern. 
This situation provides an example of a fundamentally 
important ecological pattern that was not detected 
until it was predicted by a theory and the data were 
reanalyzed appropriately. 

The interactive effects of mortality and productivity 
on species diversity were first predicted by the dy- 
namic equilibrium model (Huston 1979) and summa- 
rized as a three-dimensional response surface of diver- 
sity in relation to orthogonal axes of mortality 
frequency versus the rate of population growth and 
competitive displacement (both of which are often 
correlated with productivity and the levels of limiting 
resources). This model predicted that the single-factor 
conditions (i.e., either mortality level or population 
growth rate) under which highest diversity should 
occur would shift, depending on the level of the other 
factor (i.e., population growth rate or mortality level). 
The consequence of this interaction was that the same 
quantitative increase in disturbance level could either 
increase species diversity (under high-productivity 
conditions) or decrease species diversity (under low- 
productivity conditions), with analogous responses to 
a similar change in productivity under different distur- 
bance regimes (Fig. I.3). 
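
The reversal described here can be illustrated with a deliberately simple response surface. The function below is not the dynamic equilibrium model itself. It is a hypothetical toy surface, assumed only for this example, in which a diversity index between 0 and 1 is highest where the disturbance level matches the rate of population growth and competitive displacement, both scaled from 0 to 1.

```python
import math


def toy_diversity(disturbance, growth_rate, width=0.25):
    """Hypothetical diversity index; peaks where disturbance balances
    the rate of competitive displacement (an assumed functional form)."""
    return math.exp(-((disturbance - growth_rate) ** 2) / (2 * width ** 2))


def disturbance_effect(growth_rate, before=0.3, after=0.5):
    """Change in the toy index when disturbance rises from before to after."""
    return (toy_diversity(after, growth_rate)
            - toy_diversity(before, growth_rate))


if __name__ == "__main__":
    for label, growth in [("low productivity", 0.2),
                          ("high productivity", 0.8)]:
        print(f"{label}: change in diversity = "
              f"{disturbance_effect(growth):+.2f}")
```

With this surface, the same increase in disturbance raises the index when the growth rate is high and lowers it when the growth rate is low, which is the qualitative pattern the model predicts. The numerical values have no empirical meaning.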

Unfortunately, the predictions of the dynamic equi-
librium model have never been systematically tested in
field experiments, largely because of the size and com-
plexity of experiments involving the factorial interac-
tion of enough treatment levels to detect nonlinear
(and potentially unimodal) responses. A number of 
small experiments and studies have produced results 
consistent with the predictions of the model (Huston 
1994). In spite of the implications of these predictions 
for conservation and resource management, there was 
no systematic effort to evaluate the model until a re- 
cent effort involving “meta-analysis” (Osenberg et al. 
1999) of published experiments on the effect of graz- 
ing on species diversity (Proulx et al. 1996; Proulx and 
Mazumder 1998). 

Grazing fits within the definition of a mortality- 
causing disturbance because it results in the mortality 
of all (in the case of grazing of phytoplankton) or part 
(in the case of most terrestrial grazers) of an organism 
and thus affects both survival and competitive interac- 
tions. Lawnmowers act as mortality-causing distur- 
bances for herbaceous plants, killing a variable pro- 
portion of each plant and affecting both survival and 
competitive interactions, as do logging, ice storms, 
windstorms, fires, and other natural and anthro- 
pogenic disturbances. 

The relative ease of experimental manipulation of 
plant grazers in aquatic and terrestrial systems has 
led to a large number of published grazing experi- 
ments. These studies were reviewed by Proulx and 
Mazumder (1998) for their meta-analysis, which 
clearly showed the reversal of grazing effects on 
species diversity (plant species richness) between pro- 
ductive and unproductive environments predicted by 
the dynamic equilibrium model (Fig. I.3). This analy- 
sis found that most of the published experiments in 
systems that could be classified as productive (based 
on measured nutrient levels, precipitation, or quali- 
tative assessments such as oligotrophic versus 
mesotrophic) showed an increase in plant diversity in 
response to grazing. In contrast, all of the studies 
conducted in nutrient-poor, unproductive systems 
found that plant diversity decreased with increased 
intensity of grazing. 

The consistency of grazing effects on plant diversity
in terrestrial, aquatic, and marine systems suggests that
the predictions of the dynamic equilibrium model
(Huston 1979) are applicable to a wide range of
ecosystems with different types of mortality as well as
different controls on productivity (e.g., nitrogen,
phosphorus, water). Similar patterns of response to
disturbance can be produced by computer simulation models
of plant competition (Doyle 1981; Huston 1994).

Figure I.3. Predictions of the dynamic equilibrium model of
species diversity. Both disturbance type and frequency, and
growth rates (plant productivity, population growth rates) vary
across landscapes in response to geology, topography, and
climate. The effect on species diversity of a change in either
disturbance frequency or population growth rates can be reversed
from one region to another, depending on local conditions of
growth rates and disturbance frequency, respectively. Note that
the response of diversity to a given change in disturbance
frequency is predicted to reverse between environments with low
growth rates (a) and environments with high growth rates (c),
and that the response of diversity to a given change in
productivity (growth rate) is predicted to reverse between
environments with a high disturbance rate (d) and environments
with a low disturbance rate (f) (from Huston 1979, 1994).

The effects of disturbances on species diversity rep- 
resent the summation of disturbance effects on the dis- 
tribution and occurrence of the many species that are 
potentially present in any particular environment. The 
same basic framework of interacting productivity and 
disturbance can also be used to predict when and 
where particular species are likely to be found. The dy- 
namics of plant succession produce a shifting pattern 
of species distribution across landscapes, particularly 
in situations where productivity is high enough that 
competitive exclusion reduces diversity in late succes- 
sion (Smith and Huston 1989; Smith and Urban 1988; 
Huston 1994). It is important to recognize that the re- 
duction in diversity that potentially occurs as a result 
of competitive exclusion can only be found in groups 
of organisms that can compete with one another. In 
contrast, the diversity of broad groups of dissimilar or- 
ganisms with little potential interspecific competition is 
more likely to increase monotonically with decreasing 
mortality and increasing productivity (Huston 1994). 

In spite of the complexity of many ecological pat- 
terns, it is possible to make accurate predictions 
about ecological phenomena that vary in response to 
environmental conditions, as shown by the recent 
demonstration of the reversal of the effect of grazing 
between productive and unproductive environments 
(Proulx and Mazumder 1998). Both a sound concep- 
tual framework based on an understanding of eco- 
logical processes and a sampling design with the ap- 
propriate resolution, density, and total area are 
essential for making accurate predictions of ecologi- 
cal responses. 


Conclusions 


Our efforts to understand and predict variation in the 
abundance and distributions of species, including the 
diversity of species in any particular area, have been 
hampered by inadequate theories, mismatches between
processes and sampling scales, and inappropriate
statistical methods. A stronger theoretical framework
that addresses multiple interacting processes and
limiting factors can help resolve the patterns underly- 
ing the complexity of nature and also contribute to 
identification of appropriate sampling scales and the 
development of statistical methods more appropriate 
for quantifying causal relationships than those that 
have been traditionally used. 

Elements of a conceptual framework for predict- 
ing species occurrence include (1) recognition that 
the interactive effects of multiple limiting factors re- 
quire new statistical approaches for quantifying eco- 
logical processes; (2) matching the spatial and tem- 
poral dimensions of measurements of ecological 
patterns to those of ecological processes for hypothe- 
sis testing and model development; (3) planning of 
sampling designs and model development based on 
the probability that ecological responses may reverse 
and processes change in relative importance along 
the environmental gradients found on all landscapes; 
and (4) recognition that the interaction of population 
dynamics and competitive processes with mortality- 
causing disturbances and other factors that affect 
population dynamics can produce complex responses 
along either disturbance or productivity gradients 
that completely reverse between different environ- 
ments. The complexity of ecological processes should 
be seen as a stimulating challenge, not an insur- 
mountable barrier, to improving our predictions of 
species occurrence.

PART 1

Conceptual Framework


INTRODUCTION TO PART 1


The Conceptual Basis of Species Distribution 
Modeling: Time for a Paradigm Shift? 


Raymond J. O'Connor 


In setting out to review the conceptual basis of
species distribution modeling in the light of the con- 
tributions to this volume, I am mindful that any ma- 
ture discipline carries within it what might best be 
characterized as intellectual baggage. The discipline 
has its own set of jargon to provide shorthand to its 
key concepts and their components (Morrison and 
Hall, Chapter 2). Its concepts typically carry within 
themselves a set of assumptions and logical treat- 
ments. A long history in a discipline then honors this 
logical framework as tradition, leaving the fundamen- 
tals all too often unscrutinized for long periods. Fi- 
nally, a mature discipline typically passes through 
stages of being overwhelmed by an accumulated mass 
of contradictions between its conventional wisdom 
and emerging data, leading to a Kuhnian paradigm 
shift. The contributions to the present book reflect all 
of these elements, setting the scene, I believe, for the 
volume as a whole to be both the last hurrah of the 
old way of conceptualizing species distribution model- 
ing as a process of determining and modeling habitat 
correlates under equilibrium conditions and the foun- 
dation for a new paradigm characterizing species dis- 
tributions as nonequilibrium spatial dynamics under 
habitat constraints. Here I want to portray chapters in 
this section as collectively delineating the structure of 
a discipline struggling under the weight of its accumu- 
lated contradictions and as offering signposts to the 
new paradigm. 


The Lessons from History 


An old adage holds that those who fail to study his- 
tory are doomed to repeat its mistakes. In Chapter 3, 
Dean F. Stauffer shows this to have been true of 
species modeling as the “dawning of the quantitative
ecology era” replaced the ever-growing volume of natural
history studies that filled the first half of the
twentieth century. One of the problems with quantitative
ecology is that everyone feels obliged to conform to its 
forms, irrespective of the ability of their data and their 
training to support such activity. A continued supply 
of top-quality natural history studies would define the 
factual corpus of ecology and provide the raw mate- 
rial for rigorous analysis by technically competent 
modelers and theoreticians, a goal endorsed by Van 
Horne (Chapter 4) and Wilcove and Eisner (2000). In- 
stead, though, one is expected to have a hypothesis 
guiding every study, leaving no room for high-quality 
research into the natural history of a previously un- 
studied species or system. As a result, the literature 
has seen a plethora of supposedly quantitative studies 
that cloud more than they illuminate: one cannot 
make progress in science if models are built on un- 
specified or unchecked assumptions and if generaliza- 
tions are attempted across model results supposedly 
comparable but actually disparate (Murray 2000). 
There is, alas, no room in current ecology for rigorous 
determination of ecological facts: despite the majority
of graduate ecologists being destined for careers in
dealing with facts, too often the required formalism is 
met with toy hypotheses exuding statistical jargon. 
The most telling phrase in Stauffer’s historical review 
should give us pause in this respect: “MacArthur 
(1958) published a rigorous and quantitative, though 
not highly statistical [my italics], analysis of niche re- 
lationships of five warbler species.” How often can 
one say that of a modern paper in which all too often 
the high statistical content has crowded out all semblance
of rigorous logic? Michael Austin (Chapter 5)
reiterates that a knowledge of the distribution of 
species is central to the conservation of biodiversity, 
yet, he states, no amount of statistical modeling can 
compensate for a poorly defined problem. Continued 
development of rigorous statistical approaches to ana- 
lyzing habitat data, assisted by the spread of easy 
computation in the form of computing power and of 
packaged statistical analysis, has been unaccompa- 
nied, even to this day, by corresponding development 
of rigorous logic. 

Stauffer’s conclusion is that despite our powerful 
analytical tools there are limits to the precision of the 
models we develop because of the noise inherent in the 
systems with which we work, and that the simplest 
analyses that meet our objectives will likely be best. I 
emphasize, however, the distinction between “sim- 
plest” and “simplistic.” The use of powerful tools may 
tempt us into the equivalent of modeling the orbits of 
the planets with epicycles, getting ever-closer fits to the 
data but hiding from us the need to develop the equiv- 
alent of Newton’s Laws. 


Failure in the Practice of 
Distribution Modeling 


One of the repeating elements in Stauffer’s history of 
habitat modeling is how individual researchers period- 
ically rose to challenge the prevailing bandwagons of 
near-dogma. Not all were successful. I find it curious 
that, despite their initial impact on the field, Alldredge 
and Ratti’s (1986) urging that models be tested against 
artificial data with known structure is still largely ig- 
nored fifteen years later and has to be reiterated both 
by Stauffer and by Heglund. A handful of researchers 
have attempted to assess model validity and assumptions
by examining a cross section of published models,
but this goes only so far in protecting the scientific
community from spurious models. Models always in- 
volve some methodology and its associated assump- 
tions. With an established methodology, the responsi- 
bility of the modeler is to check the validity of the 
underlying assumptions for the particular organism 
being modeled. With any new modeling technique, 
however, a modeler should recognize his or her ex- 
tended responsibility to understand the functioning of 
the model well enough to ensure that it does behave as 
expected. (Van Horne, in Chapter 4, extends this idea 
to the checking of boundary conditions for models.) 
This is possible only through testing with data (artifi- 
cial if necessary) of known structure and by demon- 
strating that the modeling technique correctly recovers 
that structure. After all, most models are constructed 
because the context of their application is too complex 
to be represented by a simple procedure: by that same 
fact one should anticipate that the model is too diffi- 
cult to validate by inspection alone. In many respects 
such testing parallels the “order-of-magnitude”
calculations of engineering and physics: one re-calculates
one's work with conveniently rounded numbers (al- 
lowing simple tracking of the computations) to ensure 
that in working through the full specification one 
didn’t do something as stupid as shifting a decimal 
place. The approximate calculation may yield an an- 
swer that is in error by 10, 20, or 30 percent (and a 
further approximate calculation often yields an esti- 
mate of how big this error is likely to be), but an ap- 
proximate answer that is different by a factor of two 
or three indicates the need to revisit the original calcu- 
lation. The recovery of known structure from datasets 
with known (often simple) structure provides equiva- 
lent confidence that the modeling procedure is not in 
major error. (One might note here, too, that checking 
the assumptions underlying even a standard modeling 
technique requires a level of care greater than the cur- 
rent norm among ecologists. Contrary to common be- 
lief, acknowledging the presence of the problems of 
non-normal data or of sample sizes limited by logisti- 
cal resources doesn't justify ignoring the problem 
thereafter.) 
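
The check described above, recovering known structure from artificial data, can be sketched in a few lines. This is a minimal illustration rather than a procedure taken from any of the chapters: the linear form, the coefficient values, the noise level, and the function names `simulate` and `fit_line` are all assumptions chosen for the example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def simulate(true_a, true_b, noise_sd, n, seed):
    """Artificial survey: abundance is a KNOWN linear function of habitat plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [true_a + true_b * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


# Hypothetical "true" structure that the modeling technique should recover.
TRUE_A, TRUE_B = 2.0, 1.5
xs, ys = simulate(TRUE_A, TRUE_B, noise_sd=1.0, n=500, seed=42)

est_a, est_b = fit_line(xs, ys)
slope_error = abs(est_b - TRUE_B) / TRUE_B
print(f"true slope {TRUE_B}, recovered slope {est_b:.3f}, "
      f"relative error {slope_error:.1%}")
```

If the recovered slope differed from the known slope by a factor of two or three, the fault would lie in the procedure (or its coding) rather than in the biology, which is exactly the kind of error this test is meant to expose before real data are analyzed.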

Chapter 8, by Jock S. Young and Richard L. Hutto, 
is another of these periodic challenges to current
assumptions. In it Young and Hutto set out to describe
the use of large-scale exploratory studies in
determining bird-habitat relationships by exploring the
implications of the various phenomena they encounter
with their focal species, the Swainson's thrush
(Catharus ustulatus). The real merits of this chapter are
its insights into the process of modeling birds in
relation to habitat, particularly in its general analysis of
the issues involved in using spatially extensive
data-collection programs as the basis of studying birds and
habitat. Considerable thought is given by Young and 
Hutto to just how variables enter the models under 
development, and they clearly have little trust in the 
structures created by multivariate analysis. First, they 
argue—correctly in my view—that the major value of 
the models is in identifying those variables that are ei- 
ther biologically linked to the bird's abundance or are 
surrogates whose investigation may yet reveal vari- 
ables with biological significance. Second, their logic 
leads inexorably to their conclusion that most statisti- 
cal models are likely to perform poorly when tested by 
management manipulation of the critical variables in 
the model: the manipulation is likely to disrupt the 
very correlation structure that leads to the model in 
the first place. Much of the field of species distribution 
modeling has historically been crippled by a lack of 
understanding of these two points. Indeed, one can 
make a case for this lack of understanding being the 
factor most responsible for the dichotomy (empha- 
sized by Stauffer in Chapter 3 and by Van Horne in 
Chapter 4) between managers and researchers. As is 
inevitable with inductive studies, one cannot extend 
their arguments to conclude that all such studies will 
be compromised in this way but, perhaps more impor- 
tantly, in the light of their study one cannot in future 
blithely assume that correlation structures will persist 
under management manipulation. 

Young and Hutto's signposts to improved modeling 
fly in the face of the current fad for
hypothetico-deductive research. Myopic emphasis on
hypothetico-deductive analysis acknowledges only those process
variables causally traceable to the abundance of the 
species and asserts their experimental study to be the 
only way to do science. Such extreme reductionism
allows no value to the broad sweep of extensive
statistical analysis delineating both the candidate variables
for process models and the surrogate correlates likely
to stimulate new paradigms as to process. Since adher- 
ents of these views are often the reviewers of science- 
funding applications, broad exploratory research is 
hard to fund as science. Conversely, such work is 
more easily funded as management-oriented research 
where solutions are needed urgently and in the ex- 
treme the resulting empiricism rejects the need for 
process-based variables on the grounds of pragma- 
tism, hoping that a statistical model will provide the 
desired predictive power. 

Young and Hutto's notion of the destruction by 
management experiment of the very correlations that 
originally generated the predictive models challenges 
this converse and readily explains why so many predic- 
tive models have proved disappointing in practice. 
Granted this, one has to call for a more realistic view 
of the role of hypothetico-deduction in ecology. Just as 
the demise of natural history studies noted by Stauffer 
(and cogently lamented by Wilcove and Eisner [2000]) 
has, to a degree, reduced ecology to a collection of case 
studies, the lack of a large-enough corpus of statistical 
studies identifying which classes and combinations of 
variables are most frequently implicated in wildlife- 
habitat relationships has resulted in a scattering of 
species studies too diverse in scope and methodology 
to generate the broad sweep of pattern that stimulates 
truly general hypotheses. Reductionist science may well 
be the apex of the pyramid of science, but it is founded 
on an ever-widening base of more factual studies as 
one descends the pyramid. Hypothetico-deductive sci- 
ence functions best in fields with a large body of fac- 
tual information, and in the case of wildlife habitat 
modeling this fact needs to be recognized as much by 
science funding as by management funding. Little is 
gained by the creation of toy hypotheses to justify as 
hypothetico-deductive those studies that have intrinsic 
value as providers of factual information. Young and 
Hutto's message regarding the breakdown of statistical 
correlations under management manipulation is both 
fine evidence about the need for experimental hypo- 
thetico-deductive testing of causal models and a strong 
case for supporting exploratory correlative research to 
inform such experiments. Young and Hutto's chapter 
does much to clarify the strengths and weaknesses of
complementary hypothesis-based and empirical
research in this field.


Logic and Reasoning in 
Distribution Modeling 


Chapter 7, by Kristina E. Hill and Michael W. Bin- 
ford, offers a sweeping critique of the quality of logi- 
cal reasoning in contemporary habitat modeling that 
complements the concerns of Young and Hutto and of
Stauffer. Although parts of their discussion are framed 
in terms of the need to meet legal constraints when 
land use planning based on habitat models enters the 
courts, their emphasis on examining the fundamental 
assumptions of the models rather than on further the- 
oretical proliferation is salutary, reinforcing the same 
arguments touched on in this section by Stauffer, by 
Young and Hutto, by Van Horne, by Smallwood, by 
Austin, and by Huston in the volume's Introductory 
Essay. Their logic itself results in distinguishing be- 
tween two classes of models: (1) forecasting models 
based on linking known occurrences to predictor vari- 
ables that may then estimate the probability of finding 
the species present at other locations, and (2) ex- 
ploratory models that in effect model habitat potential 
on the basis of environmental similarity of areas of 
unknown use to conditions in areas of known use. 
This latter class of models helps clarify what variables
might be involved in causal linking of habitat to spe- 
cies occurrence, paralleling the similar role specified 
by Young and Hutto for multivariate exploration in 
their context. Hill and Binford (Chapter 7) also antic- 
ipate Young and Hutto's (Chapter 8) conclusion about 
unstable correlation matrices in their excoriation of ad 
hoc techniques for modeling. They do so, however, 
with a qualification rare in ecology, that there exist 
theories of measurement that can guide the formula- 
tion of practical models with limited input of ecologi- 
cal theory. This is an area in which most ecologists are 
totally untrained, resulting in a world-view that sees 
variability among organisms or plots as paramount 
and measurement issues as negligible or unimportant. 

One other element in Hill and Binford's treatise— 
raised there only in passing—deserves mention here. 
Although the logical and practical difficulties of using 
inferential correlation and regression statistics as the
basis for ecological modeling are widely (though not
universally) appreciated, there are also significant eth- 
ical issues in the use of these methods in testing mod- 
els intended for use in planning and management. As 
any experienced reviewer can testify, all too often the 
only guiding ethic in the writing of ecological papers 
appears to be the need to conform to the formalism of 
a “scientific” hypothesis within the paper. How often 
does the introduction to a habitat modeling paper 
imply great potential in the approach being presented 
as a tool to solve hypotheses about all sorts of man- 
agement or conservation problems, only to show in 
the discussion section little evidence of any concern as 
to whether the model presented is actually suitable for 
its stated purpose? To be sure, caveats are expressed, 
reservations entered, and caution urged, but these are 
fig leaves protecting authors from being crucified for 
failing to discuss limitations: the mind-set appears to 
be, “If the manuscript gets past the reviewers it must
be all right.” It is worth considering, especially in the
light of Hill and Binford's comment, whether there 
may be serious ethical questions to be addressed as to 
how we typically represent the utility of our modeling 
work. 


Statistical Tools and Techniques 


Several chapters in this section, including those by Pa- 
tricia J. Heglund (Chapter 1), and Beatrice Van Horne 
(Chapter 4), and also the volume's Introductory Essay 
by Michael A. Huston, address the underlying as- 
sumptions made in technical approaches to species 
distribution modeling. The statistical techniques typi- 
cally used in such modeling fall into a small number of 
categories: a group of related regression and correla- 
tion techniques, including simple and multiple linear 
regression and various multivariate methods; logistic 
regression; GAP (Gap Analysis Program) and habitat 
suitability index (HSI) methods; and various so-called 
modern regression methods. Each technique carries its 
own set of assumptions about the data it models. 
These chapters and Huston's essay each focus on the 
major shortcomings of current understanding of these 
assumptions. 

Heglund and Van Horne emphasize that too 
much of extant modeling effort is based on relatively
uncritical application of statistical techniques to the
distribution modeling in hand, at the expense of the
sound knowledge of environmental processes and an
understanding of the nature of species response
functions that are at the heart of robust models of such
distributions. A glance at the standard works on 
wildlife modeling shows that the dominant model in 
use is one of linear regression. However, anyone using 
linear regression as the basis of distribution modeling 
is essentially making a particular set of ecological as- 
sertions. If the independent variable in the regression 
is X (say), these assertions include (1) the species is 
limited by the amount of habitat of type X, (2) dou- 
bling X will induce a doubling of the species abun- 
dance (subject to the modification implied by any in- 
tercept), and (3) the species distribution is in 
equilibrium with X. Note that these assertions are eco- 
logical assumptions, not statistical assumptions. Hus- 
ton's chapter elaborates how spatial variation in limit- 
ing factors introduces variance that obscures, and 
even biases, the outcome of regression analysis. The 
assumption of equilibrium is, however, even more 
likely to be problematic: for most species population 
levels are likely to be below equilibrium levels as a re- 
sult of the vagaries of year-to-year impacts of climate 
and weather on overwintering survival and on repro- 
ductive output, and this alters the outcome of any sta- 
tistical analysis (Heglund, Chapter 1; Van Horne, 
Chapter 4). The principles identified by Huston as to 
spatial variation in limiting factors are equally relevant
to nonequilibrium population analysis. Similarly, the 
assumption of linear response to change in X carries 
implicit assumptions about scale of per capita re- 
sources, an issue illustrated in more detail in one of 
K. Shawn Smallwood's examples in Chapter 6. What 
is extraordinary is how poorly understood the limiting 
natures of the assumptions of this model are, despite 
the widespread reliance on regression. 
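
Assertion (2) above, that doubling X doubles abundance "subject to the modification implied by any intercept," is easy to verify numerically. The sketch below uses hypothetical coefficients and function names chosen only for illustration; it shows that strict proportionality holds only when the fitted intercept is zero.

```python
def predicted_abundance(habitat, intercept, slope):
    """Linear model: abundance = intercept + slope * habitat."""
    return intercept + slope * habitat


def doubling_ratio(habitat, intercept, slope):
    """Factor by which predicted abundance changes when habitat is doubled."""
    before = predicted_abundance(habitat, intercept, slope)
    after = predicted_abundance(2.0 * habitat, intercept, slope)
    return after / before


# Hypothetical coefficients, not taken from any study.
ratio_no_intercept = doubling_ratio(10.0, intercept=0.0, slope=0.5)
ratio_with_intercept = doubling_ratio(10.0, intercept=4.0, slope=0.5)

print(f"zero intercept:     abundance changes by x{ratio_no_intercept:.2f}")
print(f"positive intercept: abundance changes by x{ratio_with_intercept:.2f}")
```

With a positive intercept the predicted response to doubling the habitat is less than a doubling, so the fitted intercept is itself an ecological statement that deserves scrutiny.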

Logistic regression has become popular recently as 
its use in published papers has proliferated. As Van 
Horne cynically but correctly notes in Chapter 4, one 
form of habitat modeling becoming more popular 
than another always deserves to be questioned to de- 
termine whether the replacement constitutes progress. 
Given that logistic regression can model the binary 
variable of presence or absence, its use really consti- 


tutes no more than the extension of regression meth- 
ods to data at a lower level of measurement. In addi- 
tion, assumptions about a binary response need to be 
examined in the light of Kunin and Gaston's (1997) 
useful formulation of the “area of occupancy”
concept: if there are “holes” within a range boundary,
should one model the species as being potentially
anywhere within the range or as being precluded from
occurring within the holes? This question also raises the
issue of nested correlates: if range is set by long-wave- 
length variables and occupancy by shorter wavelength 
variables, the two cannot be simultaneously captured 
as a binary response. Finally, the issue of equilibrium 
raised above continues to be critical, and the statistical 
assumptions inherent in an underlying linear function 
are now replaced with the functional assumptions of 
the logistic. Thus the use of logistic regression still re- 
quires the analyst to confirm the validity of the as- 
sumptions made, a requirement largely ignored in the 
studies published to date. Indeed, logistic regression is 
in danger of being viewed as “the silver bullet which
will solve all model building problems” all too often
sought in wildlife research (L. McDonald personal 
communication). 
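
The "functional assumptions of the logistic" mentioned above can be made explicit. In the sketch below, the probability of presence follows the standard logistic function of a single predictor, which is equivalent to assuming that the log-odds of presence are linear in that predictor. The coefficient values and function names are hypothetical and serve only to expose the assumption.

```python
import math


def occupancy_probability(x, b0, b1):
    """Standard logistic response: P(present) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def log_odds(p):
    """Logit transform; linear in x under the logistic assumption."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients, for illustration only.
B0, B1 = -3.0, 0.6

for x in (0.0, 5.0, 10.0):
    p = occupancy_probability(x, B0, B1)
    print(f"x = {x:4.1f}  P(present) = {p:.3f}  log-odds = {log_odds(p):+.3f}")
```

The response is forced to be monotonic and symmetric about the point where P(present) = 0.5. A species with a unimodal response to X, or one absent from "holes" for reasons unrelated to X, violates this form just as surely as it violates a straight line.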

GAP and HSI models have been widely used to 
characterize species distributions. What they strictly 
predict, however, is an envelope of environmental and 
vegetation requirements within which the species may 
occur. These models have little utility for tracking 
changes in distribution. Both GAP and HSI map components
of the environment that have been shown to be
linked to the presence or abundance of the species of 
interest at a certain time or in a certain region, but 
their methodology does not ensure that the species- 
component relationship will persist over time or 
across regions: mapping the amount of hedgerow on 
farmland may be well correlated with the abundance 
of a small songbird species when a GAP assessment is 
prepared, but only the hedges, and not the birds, per- 
sist if the farm is sprayed with a pesticide toxic to 
birds. Obvious as this may seem, the assumption is 
regularly overlooked whenever workshop discussion 
turns to tracking distribution over time. That three
chapters here, those of Young and Hutto, Heglund,
and Van Horne (Chapter 4), find it necessary to point
this out is a serious indictment of the quality of
current modeling.

Heglund portrays the advantages of generalized lin- 
ear models (GLM) and generalized additive models 
(GAM) as regression procedures suited to distribution 
modeling. Even so, the potential of GLM and GAM 
approaches is limited by issues associated with
multicollinearity and with limitations of stepwise
regression. The application of these methods to distribution
data introduces a new element, the need to take into
account the spatial autocorrelation intrinsic to
distribution. Spatial coherence typically inflates the
significance of the GLM/GAM predictors and may also bias
them. For now, a critical safeguard with these models
is to plot the residuals from the models so as to reveal 
spatial patterning. 
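
Plotting residuals can be complemented by a simple numerical summary of spatial patterning. One common choice, not prescribed by Heglund's chapter and named here as a substitute, is Moran's I. The sketch below assumes sites arranged along a transect with adjacent sites as neighbors, and it generates two hypothetical residual series instead of fitting a real model.

```python
import math
import random


def morans_i(values):
    """Moran's I for sites on a line, with adjacent sites as neighbours (weight 1)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Each adjacent pair (i, i + 1) appears twice in the symmetric weight matrix.
    cross = 2.0 * sum(dev[i] * dev[i + 1] for i in range(n - 1))
    total_weight = 2.0 * (n - 1)
    sum_sq = sum(d * d for d in dev)
    return (n / total_weight) * (cross / sum_sq)


rng = random.Random(1)

# Hypothetical residuals with no spatial structure.
independent = [rng.gauss(0.0, 1.0) for _ in range(200)]

# Hypothetical residuals with a smooth spatial trend left unexplained by the model.
patterned = [math.sin(i / 10.0) + rng.gauss(0.0, 0.2) for i in range(200)]

i_independent = morans_i(independent)
i_patterned = morans_i(patterned)
print(f"Moran's I, unstructured residuals: {i_independent:+.3f}")
print(f"Moran's I, patterned residuals:    {i_patterned:+.3f}")
```

Values near zero are consistent with spatially independent residuals, while values approaching one indicate that neighboring sites share unexplained variation, in which case the nominal significance of the predictors should not be trusted.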

Heglund also emphasizes that the environmental 
predictors used in distribution modeling must be eco- 
logically meaningful. Even then, the long wavelengths 
of environmental data mean that these approaches 
will achieve relatively high predictive success over 
coarse resolution but are still unable to successfully 
predict the local occurrence of the species within the 
general envelope thus identified because local factors
dominate at fine scales. As I noted above, this is a
significant and all-too-often-overlooked problem for
logistic models.

Techniques are available to discriminate among the 
performances of the potential models listed above. As 
John Wiens points out in the concluding chapter of 
this volume, the use of Akaike's information criterion 
(AIC) has considerable potential in evaluating the rel- 
ative merits of competing models. Burnham and An- 
derson (1998) and Anderson et al. (2000) summarize 
many of the major problems associated with simple- 
minded hypothesis testing of models and provide an 
excellent introduction to the use of AIC in the simul- 
taneous evaluation of multiple models. 
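
As a minimal sketch of how AIC ranks competing models, the example below compares an intercept-only model with a straight-line model on simulated data. It assumes least-squares fits with normally distributed errors, for which AIC can be computed as n ln(RSS/n) + 2k, with k counting the regression coefficients plus the residual variance. The data, coefficients, and function names are hypothetical.

```python
import math
import random


def rss_mean_only(ys):
    """Residual sum of squares for the intercept-only model."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def rss_line(xs, ys):
    """Residual sum of squares for the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def aic_least_squares(rss, n, k):
    """AIC for a least-squares fit; k includes the residual variance."""
    return n * math.log(rss / n) + 2 * k


# Simulated survey in which abundance really does depend on the predictor.
rng = random.Random(7)
xs = [rng.uniform(0.0, 10.0) for _ in range(100)]
ys = [1.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]
n = len(ys)

aic_null = aic_least_squares(rss_mean_only(ys), n, k=2)
aic_line = aic_least_squares(rss_line(xs, ys), n, k=3)
print(f"AIC intercept-only: {aic_null:.1f}")
print(f"AIC straight line:  {aic_line:.1f}")
```

The model with the lower AIC is the better supported of the set considered. AIC says nothing about whether any model in the set is adequate, so it supplements rather than replaces the assumption checks discussed above.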


The Need for a Paradigm Shift 


The above sweep through the major arguments con- 
cerning the conceptual basis of our discipline shows 
that limitations of current concepts are marked and 
aggravated by generally poor implementation of the 
available approaches despite a long tradition of
informed commentary by the more percipient
practitioners. So where do we go in the face of a history of
wise advice about this litany of modeling problems 
that has failed to resolve the problems? Huston's in- 
troductory essay and Smallwood's chapter develop 
thoughtful accounts of how ecologists repeatedly fail 
to understand ecological patterns and instead wind up 
applying inappropriate statistics to quantify them. 
One issue is that the scale (extent and grain) at which 
analysis is conducted is often wrong for the scale of 
the processes of interest, for sparsely distributed sam- 
ple points may characterize gross distribution but not 
cast light on local processes. Smallwood argues that if 
the laws of thermodynamics primarily influence spa- 
tial distributions of animals, then assuming a random 
distribution as the null pattern is inappropriate and 
reduces an analyst's capacity to recognize meaningful 
aggregations. He further challenges us, as modelers, to
make certain we understand issues of scale and their 
influence on demographic organization. Far more sig- 
nificant in my view is Huston's emphasis on factors as 
limiting agents to species abundance. Huston's argu- 
ment contains two key assumptions. The first is that 
most species can be limited by any of a variety of fac- 
tors. The second is that the influence of any given 
ecologically relevant factor is not additive to the influ- 
ence of any other factor; instead, typically only one is 
limiting in any particular situation. Huston's essay 
shows how misleading correlative analyses can be 
under these assumptions, even to the point of revers- 
ing the direction of the effect on abundance attributed 
to any given factor. It is my belief that Huston's re- 
view, augmented by the points raised by the other au- 
thors I cited above, is actually the recognition of the 
accumulation of a critical mass of contradictions 
within the prevailing paradigm that immediately pre- 
cedes its overthrow (Kuhn 1970). I want now to use 
these ideas to provide the basis for a paradigm shift in 
how we model species distributions. 

Suppose we view any data distribution with respect 
to any factor of interest (a habitat or environmental 
factor) not as a function of that factor as a correlate 
but as the simultaneous outcome of multiple factors 
only one of which is limiting in any particular loca- 
tion. Figure P1.1 presents some artificial data to illus- 
trate how allowing the limiting factor to change from 
place to place translates into real world datasets. For
this illustration, we assume that there are three poten- 
tial limiting factors influencing the abundance of a 
species (implied by the height of the curve) and labeled 
X1, X2, and X3. The figure plots abundance data from 
seventeen locations against the value of variable X3 on
the abscissa and shows a unimodal response envelope 
to the maxima. Unimodal responses are regularly mod- 
eled—for example, Jongman et al. (1987)—but as a 
function rather than as a bounding envelope of the type 
discussed by Huston and regularly seen in habitat data 
(e.g., O'Connor and Shrubb 1986 p. 137; O'Connor 
1987a p. 40). For fourteen of the locations the abun-
dance of the species is limited by variable X3, the least
severe of the three factors. Point A is a location in
which all factors are favorable, and the species is in op-
timal conditions with respect to X3, so abundance is
greater there than anywhere else. At location B abun-
dance is only half that of location A because the (sub-
optimal) value of X3 is even more limiting than at A
(and neither X1 nor X2 is limiting). At location C,
however, factor X3 is optimal but abundance is lower
than at site A because factor X2 is limiting. At location
E factor X3 is suboptimal but would have allowed
abundance to reach the same value as in location B
were X2 not also suboptimal. We would see this limita-
tion by factor X2 manifest in the data points for loca-
tions C and E lying at the envelope edge in a plot of
abundance against variable X2 (as sketched in Fig.
P1.1) if we knew enough to exclude the fourteen loca-
tions not limited by X2. Finally, the most severely limit-
ing variable, X1, acts only in a further subset of locations
(here just location D), preventing abundance from rising to
the level of site A even though conditions in X3 and X2
are optimal.
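
The limiting-factor logic can be made concrete with a small simulation in which each factor sets a ceiling on abundance and the realized abundance is the lowest ceiling. This is a sketch under stated assumptions: the Gaussian ceiling, the use of one identical curve for all three factors, and the site coordinates are invented for illustration and are not taken from Figure P1.1.

```python
import math

def ceiling(x, optimum=0.0, tolerance=1.0, peak=100.0):
    """Unimodal (Gaussian) upper limit on abundance set by one factor.

    The same curve is used for every factor purely for simplicity.
    """
    return peak * math.exp(-0.5 * ((x - optimum) / tolerance) ** 2)

def abundance(x1, x2, x3):
    """Abundance equals the lowest ceiling; also report which factor(s) set it."""
    ceilings = {"X1": ceiling(x1), "X2": ceiling(x2), "X3": ceiling(x3)}
    lowest = min(ceilings.values())
    limiting = sorted(k for k, v in ceilings.items() if v == lowest)
    return lowest, limiting

# Distance from the optimum at which a ceiling falls to half its peak.
HALF = math.sqrt(2.0 * math.log(2.0))

# Hypothetical sites, loosely named after the locations in the text.
sites = {
    "A": (0.0, 0.0, 0.0),    # every factor optimal
    "B": (0.0, 0.0, HALF),   # X3 suboptimal, others optimal
    "C": (0.0, 1.0, 0.0),    # X3 optimal, X2 limiting
    "D": (2.0, 0.0, 0.0),    # X1 severely limiting
    "E": (0.0, 1.5, HALF),   # X3 suboptimal, X2 even more limiting
}

results = {name: abundance(*values) for name, values in sites.items()}
for name, (n, factors) in results.items():
    print(f"{name}: abundance {n:5.1f}, set by {', '.join(factors)}")
```

Plotted against X3 alone, sites C, D, and E fall below the X3 envelope even though X3 is no worse there than at A or B, which is the pattern a correlative fit would misread.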

Figure P1.1. Artificial data illustration showing how allowing
the limiting factor to change from place to place translates into
real world data sets.

The model of Figure P1.1 formalizes the thinking in
Huston’s introductory essay in respect of data pat-
terns, and a few further comments are in order. First,
the envelopes for factors X1 and X2 lie along axes or-
thogonal to that of X3. Second, and more critically,
their envelopes, when rotated, will not form the nested
curves quite as illustrated. All fourteen locations along
the X3 envelope will then lie in a column centered on
the optimal value of X2 but at abundance levels dic-
tated by the value of X3 they experience. The same is
true also, in turn, in respect of X1. This is difficult to
portray visually while retaining display of the limita-
tions imposed at locations C, D, and E. One should
also note that although this portrayal resembles 
Brown's (1995) idea of maximum abundance occur- 
ring where multiple factors are simultaneously favor- 
able, the present world view of the maximum being 
where all limiting factors are simultaneously permis- 
sive has very different implications from their being 
favorable. Brown's prediction that densities should in 
general be unimodal within a species range—a pattern 
that is certainly not commonplace (Maurer 1990)— 
now gives way to a more topographically diverse sur- 
face generated by Huston's pattern of shifting limiting 
factors. 

How might one implement the analysis in practice? 
Plot abundance against each variable of interest and 
find the data distribution most similar to that of Fig-
ure P1.1 for variable X3. Determine the values of
abundance at the envelope edge using one of the edge 
identification techniques discussed by Huston (though 
I would opt for fitting a unimodal curve rather than a 
quantile linear regression) and express abundance at 
all locations as a percentage of the envelope maximum 
at that X3 value. (For example, locations A and B 
would score 100, location C about 50, location E 
about 40, and location D about 20). Each score will 
have some uncertainty associated with the precision 
of the edge determination algorithm. Repeat this 
independently for the other candidate limiting vari-
ables. Considering the plot against X2, since all the
points limited by X3 are located in a column at the op-
timal (least limiting) value of X2 but at levels deter-
mined by the X3 constraint, point A will still earn a
score of 100 but point B will now earn only about 50 
and so on for the rest of the fourteen locations. All 
other points (here only three are illustrated) will score 
values determined both by their edge or interior posi- 
tion and by the scaling imposed by location A's abun- 
dance serving as curve maximum, with points such as 
E yielding scores of 100. Thus, if enough locations 
limited by X2 are in the sample, they generate a group
of scores of 100. This logic prevails recursively. With a 
uniform density of points across a plot such as Figure 
P1.1, histograms (one for each factor) of these scores 
against abundance will then yield an ordered series, 
with the least-limiting factor uniquely yielding scores 
of 100 at the highest abundances (here, from the seven 
points above the B-C level). A block of additional 
scores of 100 then enters in the histogram for the sec-
ond least limiting factor (though X3 still generates
100s at these abundances because of points such as B 
and the three below it and to its left), and so on. Re- 
cursively removing points limited (scores near 100) by 
the outer variables will reduce successive overlaps in 
the original, facilitating identifying the sequence of 
limiting variables. 
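
The scoring step can be sketched as follows. Here the envelope edge is approximated very crudely by the maximum abundance within equal-width bins of the factor; this binned maximum is an assumption standing in for a fitted unimodal curve or quantile regression, and the function name `envelope_scores` and the data are hypothetical. The sketch also assumes every bin contains at least one site with nonzero abundance.

```python
import numpy as np

def envelope_scores(x, abundance, n_bins=5):
    """Score each site as a percentage of the envelope maximum near its x value.

    The envelope is approximated by the largest abundance in each
    equal-width bin of x (an assumed, deliberately simple edge estimate).
    """
    x = np.asarray(x, dtype=float)
    abundance = np.asarray(abundance, dtype=float)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # Interior edges only, so bin indices run from 0 to n_bins - 1.
    bins = np.digitize(x, edges[1:-1])
    scores = np.empty_like(abundance)
    for b in np.unique(bins):
        in_bin = bins == b
        scores[in_bin] = 100.0 * abundance[in_bin] / abundance[in_bin].max()
    return scores

# Hypothetical sites: factor value and observed abundance.
x_values = [0.0, 0.1, 1.0, 1.1, 2.0, 2.1]
counts = [10, 5, 40, 40, 20, 2]
scores = envelope_scores(x_values, counts, n_bins=3)
print(scores)
```

Sites scoring near 100 sit on the envelope and are candidates for limitation by this factor; low scores point to limitation by some other factor, which is the information the recursive procedure exploits.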

Do we have any evidence that Huston's arguments 
can be used to synthesize such a dramatically different 
way of accounting for species distributions? I believe 
that we do. My colleagues and I, among others, have 
recently modeled a variety of species and species rich- 
ness distributions by use of classification and regres- 
sion tree (CART) analysis. What is valuable about 
CART analysis in the present context is that the data 
are recursively partitioned into subsets on the basis of 
constraints rather than correlates. The occurrence or 
abundance of a species at each binary split into sub- 
sets is greater or smaller according to whether a vari- 
able exceeds a stated value; those subsets are again ex- 
amined for further predictive rules, and so on, 
eventually yielding (on the basis of statistical opti- 
mization rules) a series of subsets—in our data typi- 
cally regional in distribution—of sites in which abun- 
dance is predicted conditional on one or more
constraint rules. Such rules are of the type that July
temperatures exceed 25 degrees Celsius, that elevation 
be below 500 meters, that cropland forms no more 
than 15 percent of the local land cover, or that average 
forest patch size exceeds 2,500 hectares. Good exam- 
ples of such models are available for plants (Iverson 
and Prasad 1998), lizards (Hollander et al. 1994), 
birds (O'Connor et al. 1996, 1999; Hahn and O'Con- 
nor, Chapter 17), amphibians (Guerry 2000), and but- 
terflies (Lawler, Bartlett, and O'Connor in prepara- 
tion). In addition, the use of GAM models described 
by Heglund in Chapter 1 can be regarded as defining 
species distributions in terms of constraint envelopes. 
Since these CART and GAM models are data hungry, 
they cannot readily be implemented with the relatively 
small datasets typically found in species habitat stud- 
ies. The approach sketched out from Figure P1.1 
seems to have potential with these smaller datasets, 
though much remains to be filled in to achieve a ro- 
bust analytical implementation of that thinking. In 
particular, the treatment of variance associated with 
census uncertainties and sampling, and the treatment 
of nonequilibrium populations in the analysis require 
detailed consideration. It is also important to recog- 
nize that implementation of my constraints model in- 
volves statistical and analytical assumptions that will 
be wrong in particular cases. Time, and experience 
with the application of the concept to particular cases, 
will inevitably yield examples where constraint models 
fail. Meanwhile, though, the concept of constraints 
models provides a unified framework within which to 
think about many of the issues raised in this book— 
the nonequilibrium situations emphasized by Wiens, 
the hierarchical action of effects with different spatial 
wavelengths and boundary conditions (Van Horne, 
Austin, Chapters 4 and 5), understanding demo- 
graphic organization in space (Smallwood, Chapter 
6), the relative utility and limitations of forecasting 
and exploratory models (Young and Hutto, Hill and
Binford, Chapters 8 and 7), the regular (but generally
partial) success of very disparate classes of models
such as HSI and GAP models and the various linear
and nonlinear regression models (multiple authors),
and the need to reconcile complexity of data patterns 
with simpler causal processes (Stauffer, Chapter 3). 
Perhaps most important of all is that it raises new 
questions that do not emerge naturally from the older
concepts of habitat correlates: as just one example, 
can Brown and Maurer's (1987) ideas of animal abun- 
dance peaking where multiple environmental factors 
coincide in being favorable be reconciled with Kolasa 
and Pickett's (1989) theory of hierarchical community 
organization solely by nesting Brown and Maurer's 
factors by wavelength (i.e., with smaller wavelength 
factors nested within long wavelength ones)? 
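
The recursive partitioning that underlies the CART models discussed above rests on one repeated step: finding the threshold on a predictor that best separates high from low abundance. The sketch below shows that single step for a regression tree, using the within-group sum of squares as the criterion. The function name `best_split` and the temperature and abundance values are hypothetical; a full CART implementation would repeat the search over all predictors and recurse into each subset.

```python
import numpy as np

def best_split(x, y):
    """Return (threshold, sse) for the binary split on x that minimizes
    the summed within-group squared error of y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_threshold, best_sse = None, np.inf
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # no threshold can separate tied predictor values
        left, right = ys[:i], ys[i:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            # Place the threshold midway between neighboring values.
            best_threshold, best_sse = (xs[i - 1] + xs[i]) / 2.0, sse
    return best_threshold, best_sse

# Hypothetical sites: mean July temperature (degrees Celsius) and abundance.
july_temp = [18, 20, 22, 24, 26, 28, 30, 32]
abundance = [2, 3, 2, 3, 11, 12, 10, 12]

threshold, sse = best_split(july_temp, abundance)
print(f"split at July temperature > {threshold} (SSE = {sse:.2f})")
```

The output is a rule of the constraint type ("July temperature exceeds 25 degrees") rather than a slope, which is why tree models fit naturally with the limiting-factor view.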

In summary, therefore, it is time to acknowledge the 
need for a scathing indictment of the poor practices 
that have accumulated in species distribution modeling 
over the decades. The chapters I discuss above collec- 
tively reiterate the well-known problems with such 
modeling, but these are problems that have chronically 
been ignored by practitioners. I believe that this situa-
tion results from the underlying models being incor-
rect. Most of the authors I cite above identify problems
with the underlying assumptions we make in our 
analysis of distributions but, absent an alternative par- 
adigm, such work in essence asks us to be more cau- 
tious about our use of results. It is not until one takes 
up Huston's alternative that one has in place the ingre- 
dients for an alternative world view. The arguments 
presented from Figure P1.1 assert that analysis of con- 
straints rather than of correlates is the only paradigm 
that accommodates the logic of local carrying capacity 
being set by the most severe of the multiple alternative 
limits. The logical evidence for this view is thoroughly 
developed by Huston; the empirical evidence is apparent
in the regression trees and GAM results in hand to
date; and the only puzzle is why it has taken so long to 
acknowledge that a habitat constraint paradigm must 
replace that of habitat correlates. 


CHAPTER 1

Foundations of Species-Environment Relations

Patricia J. Heglund


Ecology is a science of contingent generalizations, 
where future trends depend . . . on past history and
on the environmental and biological setting.
—Robert May 


Historically, humans have used a combination of
intuition and observation to locate resources for 
food and shelter. Over time we have sought to quan- 
tify the patterns that we see and to pass this informa- 
tion along to others. Our ability to detect habitat use 
patterns and to determine their cause is strongly af- 
fected by the scale of space or time at which they are 
studied (Wiens 1989c). Today more than ever, effec- 
tive conservation actions will often require specific, 
predictive, accurate models of relations between 
species and their environment (O’Neil and Carey 
1986; Stauffer, Chapter 3). In 1984, the international 
symposium Wildlife 2000: Modeling Habitat Rela- 
tionships of Terrestrial Vertebrates was held at Fallen 
Leaf Lake, California, to discuss the development and 
application of models used in predicting the response 
of wildlife to habitat changes (Verner et al. 1986b). 
The focus of the Predicting Species Occurrences: Is- 
sues of Scale and Accuracy symposium, fifteen years 
later, was to provide a forum for discussing the current 
state of our knowledge and to identify future direc-
tions with regard to our ability to develop and assess
predictive species-environment models. In this chapter,
I revisit some of the ecological foundations upon 
which species-environment modeling efforts have been 
built as a means of setting the stage for the chapters 
that follow. To do this, I briefly highlight the following 
foundational concepts: (1) that space and time will in- 
fluence the patterns we see; (2) that relations between 
species and their habitats are essentially nonlinear and 
depend on the scale of our investigations; and (3) that 
the dynamical nature of species distributions will in- 
fluence the generality and accuracy of models over 
space and time. I then briefly discuss which founda-
tions I consider weak and in need of additional work.


Overview of Species-Environment Models 


Species-environment studies examine the general envi- 
ronmental characteristics associated with the distribu- 
tion of a given species. Models based on measures of 
these conditions are used to predict patterns in species 
diversity as well as to make inferences about the at- 
tributes and adaptations of the species for which they 
have been measured. Underlying species-environment 
models is the premise that predictable relations exist 
between the occurrence of a species and certain fea- 
tures of its environment (the niche as defined by Grin- 
nell 1917; the niche-gestalt of James 1971) and that 
the distributions of species have adaptive significance 
(Hildén 1965; Rosenzweig 1981). 

Models may either address single species or more
complex multispecies assemblages when identifying 
relations between occurrence and associated environ- 
mental features at a variety of scales (local to biogeo- 
graphic). Examples of species-environment models in- 
clude prediction of species occurrence, distribution 
and abundance using habitat suitability, habitat capa- 
bility, pattern recognition, and wildlife-habitat rela- 
tions models (Morrison et al. 1998). These models en- 
compass an overwhelming array of statistical and 
nonstatistical methods employed in the assessment of 
species in relation to their environment. Modeling 
techniques include, but are not limited to, expert opin- 
ion, correlation, ordination, gradient analysis, recipro- 
cal averaging, multidimensional scaling, linear and 
nonlinear regression, as well as numerous other multi- 
variate methods (Gauch and Chase 1974; Verner et al. 
1986b; Morrison et al. 1998; Austin, Chapter 5; Jones 
et al., Chapter 35). 

Geographic information system technology allows 
ecologists to extend species-environment models into 
a spatial dimension (Goodchild, preface this volume; 
Scott et al. 1993; Knick and Rotenberry 1998). Again, 
a wide variety of statistical techniques used in model- 
ing and mapping occurrence, diversity, or probability 
of use exist, including more recent innovations such as 
gap analysis, genetic algorithm for rule-set production 
(GARP), artificial neural networks, and fuzzy set the-
ory (Scott et al. 1993; Knick and Rotenberry 1998;
Hill and Binford, Chapter 7; Rotenberry et al., Chap- 
ter 22; Lusk et al., Chapter 28). Many of these tech- 
niques hold much promise for extending the predictive 
capability of models from simple presence/absence 
and abundance to modeling population vital rates in 
response to fluctuating environmental conditions 
(Landstrom et al. 1998; Spitz and Lek 1999). 


Temporal and Spatial Patterns 


A guiding principle of resource selection is that species 
use among environments should coevolve with the 
qualities of those environments (Hildén 1965; Rosen- 
zweig 1981). Many species-environment studies focus 
on habitat relations, but clearly there is more to the 
occurrence of a species than habitat alone. At the most 
basic level, a knowledge of biogeography and evolu-
tion is essential to creating good predictive models
(Morrison et al. 1998). The distribution and abun- 
dance of organisms we see today were set in motion 
millions of years ago, influenced by climate, plate tec- 
tonics, and competition. Thus, a major goal in ecology 
is to develop our understanding of the factors that de- 
termine the patterns of distribution that we see 
(MacArthur 1972b). 


Time Considerations 


Our ability to detect resource selection patterns and to 
determine their cause is strongly affected by the tem-
poral scale or time frame within which they are stud-
ied. Time plays an especially important role in how we
interpret patterns of resource use. The amount of time 
available to an organism for habitat selection is highly 
variable. Constraints on an organism's time may be re- 
lated to the short-term nature of an important food re- 
source, or, in the case of high latitudes, to the amount 
of time available for reproduction (Orians and Witten- 
berger 1991). Time constraints may also be placed on 
an organism by social pressures wherein better habi- 
tats are occupied earlier than poorer ones (Brown 
1969a; Fretwell and Lucas 1969). In addition, cues 
available at the time of selection may not reliably fore- 
cast the availability of future resources (Orians and 
Wittenberger 1991). Populations in variable environ- 
ments periodically face unpredictable conditions 
(“ecological crunches” after Wiens 1977) that affect
their reproductive success in a given area. These eco- 
logical crunches may strongly influence detectability 
and patterns that we see within a given time frame. 
This limits the usefulness of short-term studies and 
may result in misleading information (Wiens 1981c). 


Spatial Patterns 


The assessment and prediction of a species occurrence 
depends on the resolution of environmental patchiness 
in relation to the scale of exploitation by that species 
(Levins 1968). Dispersion patterns change as the size 
of the area analyzed is changed such that over large 
areas organisms may appear aggregated whereas over 
smaller areas territoriality may lead to a uniform dis-
tribution (Whittaker 1975). These reorganizations re-
flect both intraspecific and interspecific social interac-
tions and the scale of habitat patchiness or other
resources on which species depend. Differences in
scale ultimately influence the questions that can be ad- 
dressed, the sampling procedures followed, the obser- 
vations obtained, and how the results are interpreted 
(Wiens 1989b). 

Models developed at one scale and applied at a dif- 
ferent scale may lead to misleading results (Wiens 
1989b; Orians and Wittenberger 1991; Rotenberry et 
al., Chapter 22). For example, assume we make a sim- 
ple examination of a species occurrence in relation to 
a given local habitat feature, and we only sample a 
portion of the response function that characterizes 
that gradient. The “true” relationship between the 
two variables may be quite different from the relation- 
ship identified for any portion of the gradient (Wiens 
and Rotenberry 1981a; Wiens 1989b). Detecting fac- 
tors used by individuals to select habitats depends on 
the existence of enough variability among habitats 
such that they could affect selection. If inappropriate 
scales of sampling and analysis are used, key factors 
involved in species-environment relations may not be 
detected (Orians and Wittenberger 1991). Our investi- 
gations may need to encompass a more widely distrib- 
uted sample of sites that characterize the full range of 
an environmental variable to better understand a 
species response to that variable. Thus, it may be as 
important to sample areas where a species does not 
occur as it is to sample where a species does occur. In
addition, patterns evident at a biogeographic scale 
may be a consequence of events at a local scale and 
hence an understanding of local events is necessary to 
interpret coarse-grain patterns correctly (Wiens 
1989b). Alternatively, patterns evident at a fairly fine 
grain may reflect widespread disturbances or the ac- 
tual distribution of critical resources (Orians and Wit- 
tenberger 1991; Rotenberry 1981). 
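
The hazard of sampling only part of a response function is easy to demonstrate numerically. In the sketch below, a species responds unimodally to a gradient, and a straight line is fitted to the lower half, the upper half, and the full gradient in turn. The Gaussian response, its optimum and width, and the variable names are assumptions chosen purely for illustration.

```python
import numpy as np

# Hypothetical environmental gradient and a unimodal response centered at 5.
gradient = np.linspace(0.0, 10.0, 101)
response = np.exp(-0.5 * ((gradient - 5.0) / 1.5) ** 2)

def fitted_slope(x, y):
    """Slope of the least-squares straight line through (x, y)."""
    return np.polyfit(x, y, 1)[0]

lower = gradient < 5.0   # study confined to the low end of the gradient
upper = gradient > 5.0   # study confined to the high end

slope_lower = fitted_slope(gradient[lower], response[lower])
slope_upper = fitted_slope(gradient[upper], response[upper])
slope_full = fitted_slope(gradient, response)

print(f"lower half: {slope_lower:+.3f}")
print(f"upper half: {slope_upper:+.3f}")
print(f"full range: {slope_full:+.3f}")
```

Two studies of the same species could thus report opposite relationships with the same variable, and a study spanning the whole gradient could report none, depending only on the portion of the gradient sampled and the linear form of the model.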


The Importance of Scale in Modeling 


Distributional modeling has historically been con- 
ducted at four scales. At the continental, or biome 
scale, biogeographers frequently define the range of a 
species based on presence/absence or they portray pat- 
terns of species richness. The foundational concept for 
these coarse-resolution models resides in resource se-
lection theory (Rosenzweig 1981). According to cur-
rent theory, habitat selection occurs on two levels: ul-
timate and local (Hildén 1965; Cody 1985b). At the
ultimate or evolutionary level, selection for specific 
habitats is reflected in differential reproductive rates 
(e.g., species fitness, Levins 1968). The ability of 
species to occupy a given location is ultimately re- 
stricted by aspects of its physiology, ecology, morphol- 
ogy, and behavior (Wiens 1989b). Subsequently, selec- 
tion is influenced by proximal factors that work at 
increasingly finer or more local scales (Hildén 1965). 


Regional Scale 


At a regional scale, presence/absence is the most com-
mon response variable. The result is a model of a
species' occurrence within its range, delineated
by the distribution of particular habitats, which are
often coarsely specified (e.g., Kirtland's warbler [Den-
droica kirtlandii] and jack pine [Pinus banksiana]
forests). A few studies conducted at a regional level
have resulted in the prediction of species abundance 
values or species richness, but the value and reliability 
of these predictions are uncertain without validation 
(see Part 4, and Chapter 65). Even fewer studies have 
attempted to predict productivity measures of a 
species or various changes in their population vital 
statistics in relation to environmental characteristics 
or perturbations (see Part 5). 


Local Scale: The Fundamental Niche 


On a proximate level, habitat selection involves imme- 
diately operative factors. A species distribution may 
further be constrained by geophysical events, resulting 
in temporary stochasticity in resource levels. Whether 
or not a species actually occurs in a given location 
may be constrained by a variety of biological interac- 
tions, including competition and predation (An- 
drewartha and Birch 1954; Hildén 1965; Connell 
1980; Wiens 1989b). Thus, the ability of a species to 
occupy a given place is ultimately restricted by species- 
specific features. 

At the local level, species-environment models are 
based on the assumption that an individual selects a 
general location according to certain landscape or 
topographic features. Because food resources, preda- 
tor populations, and climatic conditions vary in time 
and space, organisms may be unable to directly assess 
critical resources at a given location but rather must
rely on certain habitat features that are indirect but 
stable measures of these resources (Smith and Shugart 
1987). Local habitat features may serve as cues that 
provide organisms with a means of predicting future 
environmental conditions or evaluating current habi- 
tat suitability (Orians and Wittenberger 1991). Such 
assessment processes can be seen in local, site-specific 
models that describe the occurrence of a species within
a plant association; they are typically a function of mi- 
crohabitat features. The development of these models, 
as with all species-environment models regardless of 
scale, is founded on the concept of the niche (Grinnell 
1917; Elton 1930; Hutchinson 1957; Peters 1991; 
Cao 1995). Many ecologists, as a matter of conven- 
ience, view the frequency of distributions of resource 
use along one or more resource axes as characterizing 
the niche (Cao 1995). 

Grinnell emphasized the environmental require- 
ments of a species and considered the niche a funda- 
mental distributional unit of species (Grinnell 1917). 
Elton (1930) later defined the niche as the “role” of
the species in the community, which is a behavior- 
based concept. This definition highlighted the role 
other species play in shaping the expressed niche of an 
organism. Both definitions are considered conceptu- 
ally vague and years later a quantitative concept of the 
niche was proposed by Hutchinson (1957). Based on 
this concept, the niche is best described by the coordi-
nates of a species with n-dimensional resource axes
and combines both the behavioral and the distribu- 
tional concepts of Elton and Grinnell (Cao 1995; 
Morrison et al. 1998). Thus, we generally collect data 
on a multitude of variables within the environment, 
select from among the measures most strongly related 
to the occurrence of the species, and devise models 
that generally describe the location of that species in 
just a few dimensions (MacArthur 1968; Krebs 1972). 
Morrison et al. (1998) suggested we focus species-en- 
vironment studies on what they call a resource axis of 
the niche because animals select habitat only in the 
broadest geographic sense, but they select resources 
within habitats at a very fine resolution. Herein lies an 
important distinction: resources are important to the 
survival and successful reproduction of an individual 
and the predictive strength of our models lies in our
ability to identify and measure resources relevant to
the species at hand.


Local Scale: Characterizing the Realized Niche 


In reality, the fundamental niche is unlikely to be 
seen in the real world because the presence of com- 
peting individuals necessarily restricts a given species 
to a narrower range of conditions—its “realized” 
niche. The foundation of our current modeling ef- 
forts lies in the characterization of a species' realized 
niche rather than simply determining habitat rela- 
tions. Once patterns in resource use have been recog- 
nized for a species or group of species, ecologists 
then attempt to identify specific elements of the envi- 
ronment associated with the occurrence of an indi- 
vidual or species. The numerous niche-gestalt ver- 
sions of species-environment modeling (James 1971) 
generally relate species response in a binomial fash- 
ion (presence/absence) with very detailed independ- 
ent variables. Theoretically, most species should ex- 
hibit a unimodal distribution approximating a 
Gaussian response curve, with a maximum response 
at some point along an environmental gradient 
(Gauch and Chase 1974). However, a number of fac- 
tors (competition, predation, disturbance, etc.) place 
pressure on an organism and the curve narrows to its 
realized niche. It is important to note that the real- 
ized niche and its optimum may differ from the fun- 
damental niche in both location along a gradient and 
in the shape of the response (Austin and Meyers 
1996). Austin and Meyers (1996) noted that the real- 
ized niche could take on a variety of shapes from 
skewed to bimodal. They cautioned that failure to 
recognize the various shapes of response curves may 
result in inefficient or incorrect predictive models. 
Under niche theory we would expect a species re- 
sponse to any environmental variable to be curvilin- 
ear at a minimum. It is therefore somewhat surpris- 
ing that the occurrence and number of species has 
been linearly related to a variety of habitat features 
(MacArthur 1964; Wiens 1989a; James and McCul-
loch 1990). Linear models may not adequately ex-
plain species-environment relations without includ-
ing quadratic or more complex terms (Meents et al.
1983).
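
Why a quadratic term matters can be shown directly. If abundance follows a Gaussian response curve, its logarithm is exactly quadratic in the environmental variable, so fitting a second-degree polynomial to log abundance recovers the optimum and the tolerance (niche width). The sketch below uses noise-free synthetic data with an assumed optimum of 6 and tolerance of 2; real data would add sampling error and zero counts that this simple fit does not handle.

```python
import numpy as np

# Synthetic Gaussian response: peak 40, optimum 6, tolerance 2 (all assumed).
x = np.linspace(0.0, 10.0, 21)
y = 40.0 * np.exp(-0.5 * ((x - 6.0) / 2.0) ** 2)

# log(y) = b0 + b1*x + b2*x**2, with b2 = -1/(2 t^2) and b1 = u/t^2.
b2, b1, b0 = np.polyfit(x, np.log(y), 2)
optimum = -b1 / (2.0 * b2)
tolerance = np.sqrt(-1.0 / (2.0 * b2))

# For comparison: the linear correlation over the same data.
linear_r = np.corrcoef(x, y)[0, 1]

print(f"estimated optimum   = {optimum:.2f}")
print(f"estimated tolerance = {tolerance:.2f}")
print(f"linear correlation  = {linear_r:.2f}")
```

The straight-line summary reduces a strongly structured response to a single correlation coefficient and gives no indication of where along the gradient the species does best.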


The Dynamical Nature of Populations and Environments


The approaches described above each provide a fairly 
static picture of populations when in reality species- 
environment relations are dynamic. Many species- 
environment models are based on cross-sectional stud- 
ies of habitat that may be replicated in space, but not 
in time (O'Connor 1986). Populations fluctuate in 
abundance between years in response to a number of 
factors similarly dynamic, including weather, food 
conditions, habitat, predator abundance, and parasite 
loads (Wiens 1989b). These factors may affect the 
overwinter survival and the reproductive output of or- 
ganisms, thus holding breeding densities below carry- 
ing capacity (Van Horne 1983). Variations in popula- 
tions may be cyclic, allowing for the development of 
predictions, or episodic over time and space, restrict- 
ing our ability to make reliable predictions about 
species-environment relations (Wiens 19892). Equally 
important, populations may not respond to environ- 
mental changes in the short term (e.g. Wiens and 
Rotenberry 1985) or even in the long term, leading to 
what Knick and Rotenberry (2000) have called the 
“ghosts of habitats past.” 


Environmental Variation 


Population levels are not static from year to year given 
the variations in environmental conditions that occur. 
Depending on the species, there may be fairly close 
tracking of changes in resource levels (e.g., food) or 
environmental conditions (e.g., precipitation; Wiens
1989a). However, when environments become
episodic or fluctuate radically, close tracking of
resources is difficult. Alternatively, conditions may
fluctuate in a cyclical fashion within or beyond the
lifespan of individual organisms (e.g., El Niño), making
prediction more feasible (Wiens 1989b).

Populations and intra- and interspecific dynamics
vary from location to location. Numerous hypotheses
have been proposed to account for spatial variation in 
the number of species (Begon et al. 1990), including 
evolutionary history of an area, climate, disturbance, 
seasonality, energy, competition, and predation.
Predicting species occurrence and abundance may be
difficult when populations and their requisite resources
or competitors and predators vary spatially and
temporally (Wiens 1989b). The environmental patchiness
or local environmental heterogeneity influences the 
spatial variability of any resource or feature relevant 
to the existence of an organism in that location (Roth 
1976). Such predictive models will require, at least, 
knowledge about individual species-resource require- 
ments and about the production of resources in an 
area (Wright 1983). Detailed information about re- 
sources available to a particular species and how that 
translates into an estimate of abundance for that 
species would ideally include some consideration of 
how much was unavailable due to physical environ- 
mental conditions, competitors, and predators, and 
about the resource requirements of individuals of the 
species (Wright 1983). Knowledge of the temporal 
and spatial dynamics of populations in relation to 
variations within and among their environments is im- 
portant for several reasons. Although annual changes 
in populations may be apparent between sampled 
areas within a single location or among several loca- 
tions within a given habitat, when samples or loca- 
tions are combined and annual changes averaged over 
a broader scale, substantially less variation may be ob- 
served (Wiens 1989b). 
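
The damping of annual variation under aggregation can be seen in a deliberately extreme sketch. The counts below are hypothetical values chosen so that two sampled areas fluctuate exactly out of phase; real data would show partial rather than complete cancellation.

```python
from statistics import mean, pstdev

# Hypothetical annual counts at two sampled areas (illustrative values only).
site_a = [10, 20, 10, 20]
site_b = [20, 10, 20, 10]

# Annual mean across both areas, i.e., the broader-scale view.
pooled = [mean(pair) for pair in zip(site_a, site_b)]

variation_a = pstdev(site_a)            # year-to-year spread at area A
variation_b = pstdev(site_b)            # year-to-year spread at area B
variation_pooled = pstdev(pooled)       # spread after averaging over areas
print(variation_a, variation_b, variation_pooled)
```

Each area varies substantially between years, yet the averaged series is flat. A model built from the pooled series alone would miss local dynamics entirely.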


Intraspecific Dynamics 


If a goal of species-environment studies is to identify 
patterns that reveal underlying ecological processes, 
the interpretation of species responses to environmen- 
tal features may be difficult without a knowledge of 
the distribution of key environmental features and the 
density of populations in relation to them (Orians and 
Wittenberger 1991). A broader-scale approach may 
provide an understanding of certain aspects of
population dynamics. Brown (1969a) suggested there might
be limits to the relation between a species' density and
habitat characteristics because, as a population increases,
the number of territorial individuals would eventually reach a
carrying capacity set by the size of the habitat patch and
a minimum territory size. Fretwell and Lucas (1969)
expanded this model to describe a hierarchy of habitat
preferences, with the highest-quality (as measured by
evolutionary fitness) habitats colonized first. They also
proposed that habitat quality might be density dependent,
such that quality decreases with increasing density. In this way, an
individual breeding in suboptimal but uncrowded
habitat might have the same fitness (following the 
ideal free distribution) as an individual breeding in an 
optimal but crowded habitat (Fretwell and Lucas 
1969). 
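
A minimal sketch of this settlement process follows. It assumes, purely for illustration, that the fitness a habitat offers declines linearly with the number of occupants; the function name `settle`, the quality values, and the crowding costs are hypothetical and are not taken from Fretwell and Lucas (1969).

```python
def settle(n_individuals, basic_quality, crowding_cost):
    """Place individuals one at a time in whichever habitat currently offers
    the highest fitness, where fitness = basic quality - cost * occupants."""
    counts = [0] * len(basic_quality)
    for _ in range(n_individuals):
        # Fitness each habitat would offer a newcomer at its current density.
        fitness = [q - c * n for q, c, n in zip(basic_quality, crowding_cost, counts)]
        best = max(range(len(counts)), key=lambda i: fitness[i])
        counts[best] += 1
    return counts

# Hypothetical habitats: index 0 is optimal, index 1 is suboptimal.
quality = [10.0, 6.5]
cost = [1.0, 1.0]

low_density = settle(3, quality, cost)     # few settlers
high_density = settle(10, quality, cost)   # many settlers
print(low_density, high_density)
```

Under these assumptions, all three settlers at low density occupy the optimal habitat. At high density, the suboptimal habitat is also used and the fitness offered by the two habitats converges, which is the pattern the ideal free distribution describes.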

These dynamic concepts are relevant to species- 
environment modeling (O'Connor 1986). Fretwell and 
Lucas’ (1969) hierarchy of habitat preference predicts 
that a greater variety of habitats should be used as 
population density increases and that individuals will 
occupy the best habitats when density is low. Addi- 
tionally, reproductive success should decline as popu- 
lation density increases as more individuals are forced 
to breed in suboptimal habitats (O'Connor 1986). In 
this way, we may have an altered view of the significance
of certain habitat features in relation to the occurrence 
of a species, and species-environment correlations are 
likely valid only at low densities. In addition, nonlin- 
earities in species-environment relations are more 
likely under this scenario. Thus O'Connor (1986) cau- 
tions us to remain aware of the dynamical aspects of 
populations and habitat use and to consider the possi- 
ble impacts of species distributions that may be influ- 
enced by different processes in the development of 
species-environment models. 


Species-Energy Relations 


Strong relations have been demonstrated between 
species richness and temperature, solar radiation, and 
precipitation at the regional scale (Wright 1983). Cur- 
rie (1991) has shown that energy, as measured by ac- 
tual evapotranspiration or potential evapotranspira- 
tion, within a system is an underlying mechanism 
driving species richness. Latitudinal gradients in 
species diversity (Pianka 1966) and in various metrics 
related to the ecology of species, including clutch size, 
have been recognized (Lack 1947). Ashmole (1963) is
credited with suggesting that latitudinal gradients in
clutch size might well be related to energy, measured
as the ratio of summer to winter actual
evapotranspiration
(Ricklefs 1980). Thus, species richness in any given
area is ultimately limited by physical constraints such 
as energy availability (Currie 1991) and seasonality 
(Ashmole's hypothesis after Ricklefs 1980). The
concepts embodied in species-energy theory and
equilibrium biogeography can potentially be used not only to
predict patterns of species number but also to address 
in detail the abundances of individual species and their 
probabilities of being present or absent (Currie 1991). 


Modeling in the Absence 
of Adequate Data 


Although a variety of strong relations have been iden- 
tified between species and their environments, the ex- 
tent of a study in time or space will have an influence 
over the relations observed. As modelers, we attempt 
to—and perhaps even believe that we can—perceive 
the environment in the same way as our species of in- 
terest (Best and Stauffer 1986; Morrison et al. 1998). 
Ecologists still do not fully understand how ecological 
and historical phenomena interact to determine diver- 
sity or the exact distribution of a species (Schluter and 
Ricklefs 1993). Currently, we make assumptions re- 
garding the nature of what we consider important en- 
vironmental metrics, whether they be causal or simply 
serve as a surrogate measure with some indirect and 
often unknown relation to a causal factor. 

The life-history information necessary to develop 
reliable ecological models for any local population is 
limited or lacking (Laymon and Barrett 1986). Despite 
the large number of wildlife-related studies conducted 
annually, only a small number report on species- 
environment relationships (Karl et al. 1999). Much of 
the available natural history information
involves a few well-studied taxa such as game species
and other politically important species (e.g., threat- 
ened and endangered species, or species of special con- 
cern, Karl et al. 1999). Large area mapping and mod- 
eling efforts typically do not include collection of field 
data. For mapping purposes, modelers rely on infor- 
mation from the literature. Much of the ecological lit- 
erature is either at a finer scale (grain) than can be 
mapped with today’s technology or it resides in often 
hard-to-find state and government reports or student 
theses, or it simply does not exist (Laymon and Barrett 
1986; Karl et al. 1999). What we end up with is an
excess of measured variables and a lack of
information regarding which ones are important. A thorough
knowledge of the natural history of species and their
life requisites is essential for understanding the
relations between species and their habitat and for
determining which particular habitat components are
relevant to the survival and productivity of each species
(Rosenzweig 1981; Morrison et al. 1998). 


Weakness in the Foundations 


To date, most of the advances in the study of species- 
environment relations have involved refinements of 
analytical techniques (Morrison et al. 1998). Unfortu- 
nately, linear models have frequently been used to de- 
scribe relations between an organism and its environ- 
ment that are essentially nonlinear (Meents et al. 
1983; James and McCulloch 1990). Although strong 
linear relations have been demonstrated among 
species assemblages and a variety of habitat features, 
including vegetation structure and composition 
(MacArthur 1964), the distribution, or response, of an 
organism in regard to a given environmental variable 
is generally considered nonlinear and should be mod- 
eled as such (Whittaker 1975; Gauch and Chase 1974; 
Austin 1976; Heglund et al. 1994). 

A common criticism of species-environment mod- 
els is that most are based on correlations and provide 
little insight pertaining to the mechanisms underlying 
species-environment relations (Capen 1981; Capen et 
al. 1986). A better knowledge of the ecological and 
evolutionary causes of the species-environment
relations we see will lead us to a rationale for selecting
the appropriate variables to measure. It is critically 
important to understand the relations underlying the 
observed patterns before we can devise or implement 
effective management plans and to do so at the cor- 
rect scale. Morrison et al. (1998) caution that if our 
fundamental approaches to modeling are flawed, the 
analytical tools we use to develop models are of little 
importance. Additionally, we recognize that if correl- 
ative factors are weakly to moderately related to 
causal factors, then their predictive value is likely 
poor, but we seldom test the accuracy of our models.
This is, in part, due to insufficient field data that are
most often drawn from a restricted range of observa- 
tions from a small area due to time and cost con- 
straints. These data are then used to develop models 
that are often applied over a wider area representing 
a novel array of environmental conditions (Johnson 
1981b; Green 1979). Many models are developed 
specifically for use at one scale or another, and it has 
long been recognized that extrapolation of results to 
multiple scales is limited by the nature of the data. 
Ideally, models should be developed and tested using 
independent data sets derived from field studies that 
draw samples from the range of population densities
and habitats used by each species of interest (Johnson 
1981b). 

Finally, we should keep in mind that the statistical 
distribution of a species may not adequately represent 
the actual distribution of a species. The dynamical na- 
ture of populations and habitats can lead us to de- 
velop predictive models that give an altered view of 
the significance of certain habitat features in relation 
to the occurrence of a species. In future modeling ef- 
forts, we should try to consider all of the possible in- 
fluences on species distributions discussed herein and 
throughout this book and focus more effort on deter- 
mining underlying causes of these relationships. 


Standardized, operational definitions are essential if
different workers are to make similar
measurements of similar entities. Operationalization is the
practical specification, or measurement, of the range 
of phenomena that a concept or term represents. Be- 
cause no definition is so precise that all uncertainties 
about its meaning are removed, initially unnoticed 
ambiguities in the definition will likely lead to misap- 
plication of the concept (Peters 1991:77). Thus, au- 
thors who provide new or modified definitions of old 
terms, no matter how precise those definitions may be, 
are undoubtedly adding to a conflagration of terms 
and further confusion. Standardization and opera- 
tionalization of terms is also important for the users of 
the results of scientific research. 

Growth of ecological understanding depends in 
large part on the exploration of new concepts. As 
noted by Peters (1991:78), it is normal practice for 
good scientists to entertain seemingly irrational, 
vaguely formed, and poorly defined concepts and their 
associated terms, because these concepts and terms are 
often the building blocks that lead to strong theory. 
However, because a concept is a general notion of 
how a process behaves, it is difficult to operationalize 
it. And, unfortunately, because concepts are not open 
to testing until they are recast as testable hypotheses, 
the early products of conceptual development often 
multiply and then are difficult to extirpate (Peters
1991:79). As developed by Goldstein (1999), when
conservation planning and management are based on 
novel constructs that are imprecisely defined, then the 
strategy itself is founded in a vacuum. 

Peters (1991:20) thought that although nothing is 
gained by arguing definitions, differences in the mean- 
ings behind words could be important. If a clear, oper- 
ational definition is not developed, different users of a 
term may develop independent, often inconsistent def- 
initions. In this way the original concept diversifies in 
meaning until any single meaning of the concept ap- 
pears restrictive and inappropriate. A concept soon 
carries so many meanings that one can never be sure 
which is intended. Likewise, Fauth (1997) noted that 
definitions are not simply fodder for semantic argu- 
ments. Rather, definitions arise from the operational 
requirements and structure of theory; the challenge is 
to decide which definition best promotes further theo- 
retical development. 

As noted by Peters (1991:81-82), clarity can only 
be achieved if a term is defined at each use, but this 
proliferation of definitions usually only confounds the 
problem. As summarized by Peters (1991:82), Hurl- 
bert (1981) found twenty-seven definitions for 
“niche,” MacFadyen (1957) found seven for
“community,” and Hawkins and MacMahon (1989) found
three for “guild.” Recently, Hall et al. (1997b)
reviewed fifty papers from prominent journals and
books in the wildlife and ecology fields between 1980 
and 1994 for uses of the term “habitat.” They
concluded that habitat terminology was used vaguely and
imprecisely in 82 percent of the articles reviewed and 
that few papers provided definitions. For example, 
they found that “habitat type” was used—without 
definition—to describe both a simple vegetation asso- 
ciation and the habitat used by one or more animal 
species (see our definitions below). 

Thus, there is both theoretical and practical utility 
in establishing precise and consistent definitions of 
terms. This is not to say that definitions cannot be 
changed over time. But, we argue that any substantive 
changes must be well developed, justified, and criti- 
cally peer reviewed before use. Science advances more 
by testing theory than it does through arguing over the 
definition of terms. However, the dramatic creation 
and misuse of terms makes it difficult for anyone to 
apply the results of different studies. And, there is no 
justification for being sloppy with the use of terms, 
nor should we accept vague and self-serving justifica- 
tions for failing to adhere to rigorous standards. 

In this chapter, we first develop a hierarchical con- 
text within which to place ecological studies, and then 
we give definitions following from this context. Defin- 
itions must be developed in this manner; that is, they 
must follow from a theory or concept. There is no 
heuristic value in developing ecological terms in isola- 
tion from a conceptual framework. 

The terms and definitions provided below are 
meant as a guide for all authors whose works appear 
in this volume, as well as for other ecologists. For con- 
sistency, authors have agreed to follow our terminol- 
ogy or to provide a rationale for using alternative def- 
initions. This will assure readers that terms will be 
defined and adhered to throughout this volume so that 
meanings are not equivocal. Additional terms are de- 
fined in the Appendix. 


Hierarchical System Organization: 
Level and Scale 


The terms level and scale are not synonymous (King 
1997:198), and unfortunately, ecologists and
geographers have used different definitions of “large” and
“small” scale. Scale refers to the resolution at which
patterns are measured, perceived, or represented. Scale
can be broken into several components, including
grain and extent. Grain is the smallest resolvable unit 
of study, for example a 1x1-meter quadrat. Grain gen- 
erally determines the lower limit of what can be stud- 
ied. Extent is the area over which observations are 
made and the duration of those observations; for ex- 
ample, the boundaries of a study area, a species range, 
or duration of study (Milne 1997). Thus, many char- 
acteristics of wildlife will vary with scale, such as 
habitat, animal density, patch geometry, or resource 
availability (habitat and resource are defined below). 
Different patterns we observe in these characteristics 
with changes in scale reflect transitions between the 
controlling influence of one environmental factor over 
another (Milne 1997). 

Thus, grain and extent should define the scale of 
each study. Because four combinations of grain and 
extent are possible, the term scale should not be used 
without first specifying which combination is in- 
tended: small-scale studies should be of small grain 
and either small or large extent; large-scale studies 
should be of large grain and either small or large ex- 
tent. To define these terms otherwise only leads to 
confusion among workers and makes it difficult to 
place study results in the context of overall theoretical 
considerations (e.g., foraging, movement patterns, 
population control). 

To further complicate matters, “scale” for a geogra- 
pher or cartographer refers to the relationship be- 
tween distances on a map and distances on the 
ground. Thus, 1:24,000 is a larger scale than 
1:100,000. Technical limitations (e.g., amount of time 
and money, computer storage capabilities) limit the 
application of large mapping scales to large areas. 
Thus, the small-grained study of an ecologist can be 
diagrammed or mapped at a relatively large mapping 
scale if the spatial extent of the study area is not too 
large. If the extent is too large, then the study elements 
can only be depicted at a small mapping scale. But, the 
mapping scale is not directly related to the grain of the 
ecological study being pursued. 
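
The cartographer's usage can be made concrete with a short sketch. It treats a map scale as a representative fraction and converts a map distance to a ground distance; the function names are ours and serve only to illustrate the arithmetic.

```python
from fractions import Fraction

def representative_fraction(denominator):
    """Map scale 1:denominator expressed as a fraction (map distance / ground distance)."""
    return Fraction(1, denominator)

def ground_distance_m(map_distance_cm, denominator):
    """Ground distance in meters represented by a map distance in centimeters."""
    return map_distance_cm * denominator / 100

# 1:24,000 is the LARGER mapping scale because its fraction is larger.
larger = representative_fraction(24_000)
smaller = representative_fraction(100_000)
print(larger > smaller)                       # True
print(ground_distance_m(1, 24_000))           # 240.0 m on the ground per map cm
print(ground_distance_m(1, 100_000))          # 1000.0 m on the ground per map cm
```

One centimeter on the larger-scale map covers less ground, so more detail can be drawn, but more map area is needed to cover the same extent. None of this fixes the grain of the ecological observations themselves.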

Workers should be careful not to confuse mapping 
scales with the ecological scale necessary to investigate 
a concept of interest. Jenerette and Wu (2000)
discussed these definitions of scale in detail and urged
ecologists to be explicit when defining their use of the
term. 

Level is a relative ordering of system organization 
(after King 1997). Invoking the term system implies 
that a set of connected parts form a whole or work to- 
gether in some fashion. The levels of organization in a 
hierarchically organized system are not isolated or in- 
dependent. The elements at one level emerge as a con- 
sequence of the interactions and relationships among 
elements of the next lower level. These interactions 
and relationships can be quantified by differences in 
rate structure, frequency of behavior, frequency of in- 
teraction, and interaction strength. Thus, the higher 
levels provide the context for the lower levels. 

King (1997) concluded that levels of organization 
should always be extracted from data and should not 
be imposed by a priori assumptions about what levels 
of explanation should be. For example, a landscape is 
often considered a hierarchical “level.” But, do the
patches within the landscape interact to form the
landscape? It cannot be presumed that they do. Thus, it is
better not to refer to a landscape as the “landscape
level” but rather as just “landscape” until interactions
are demonstrated. “Level” is used to indicate “the
level of organization revealed by observation at the
scale (grain and extent) under consideration.”


Definitions 


The following definitions were offered as a common 
vocabulary to authors, and it is hoped they will also 
provide guidance for readers. 


Habitat 


The term habitat is a concept and, as such, cannot be 
tested per se. Habitat is understood, even by the lay 
public, to mean a place where an animal resides. The 
traditional definition of habitat is useless as a
predictive tool; it is a “concept cluster” (sensu Peet 1974)
with numerous similar, but not identical, definitions.
There is no “theory of habitat.” Rather, habitat is a
concept that serves as an umbrella under which spe- 
cific relationships between an animal and its sur- 
roundings are stated as testable hypotheses. 

Habitat has a spatial extent that is determined
during a stated time period. Thus, the physical area
occupied by an animal can be described (by the observer)
by both extent and grain. The various factors we com- 
monly recognize as components of habitat—cover, 
food, water, and such—are contained within this area. 
The direct or indirect use of these factors can be meas- 
ured and quantified; numerous authors (review in Hall 
et al. 1997b) have recognized the term “habitat use.”
A functional relationship between resources (defined 
below) and animal performance is assumed; the ob- 
server often does not define a specific area, or he or 
she might produce a user-defined area that is perhaps 
based on animal activity. Thus, habitat is a convenient 
boundary for measurement of vegetation, various 
other resources, and the environment. A priori, user- 
defined boundaries are convenient but are likely artifi- 
cial, because the resources contained within those spa- 
tial areas will, of course, vary over time both in 
response to abiotic factors and as a result of use of re- 
sources by animals. The spatial extent of the habitat 
and the grain at which measurements are taken should 
be defined to increase communication among workers. 
Descriptions of habitat should consider the dynamic 
nature of the components. Terms such as
microhabitat, mesohabitat, and macrohabitat refer to grain size
and are usually used to characterize the continuous 
nature of the factors that can be measured within an 
animal’s habitat. Here again, technical limitations pre- 
vent microhabitat from being described and measured 
over a large extent at large mapping scales. Macro- 
habitat usually includes measures such as canopy 
cover and tree density, whereas microhabitat will in- 
clude shrub-stem density and pebble cover. 
Quantifying habitat quality requires discovering 
which relationships determine individual fecundity/ 
survivorship at the appropriate scale (grain and ex- 
tent). The strength and frequency of interactions be- 
tween the individual and its environment define the 
performance of the animal (e.g., survival, fecundity) 
and are considered niche relationships (discussed 
below). Thus, we can draw boundaries around where 
the animal performs activities and interacts with the 
biotic and abiotic characteristics, which can then be 
called the spatial extent of the habitat. Within the spa- 
tial extent we can then define the spatial and temporal 
resolution of our observations—the habitat grain. 
Habitat can certainly be used to develop general 
descriptors of the distribution of animals. However,
we fail repeatedly to find commonalities in “habitat”
for most populations across space because we are usu- 
ally missing the underlying mechanisms (e.g., size dis- 
tribution of prey, forage nutrients, competitive factors) 
that determine occupancy, survival, and fecundity. 
Habitat per se can only provide a limited explanation 
of the ecology of an animal. Other concepts, including 
niche, must be invoked to more fully understand the 
mechanisms responsible for animal survival and fit- 
ness. As we have seen in many wildlife studies (e.g., 
Collins 1983; Mosher et al. 1986; review by Morrison 
et al. 1998), “high-quality habitat” varies in physical
attributes for a species across its range because
habitat descriptions do not measure mechanisms. Rather, our statistical
models of habitat are, at best, analyzing surrogates of 
these mechanisms. Habitat is a useful concept for de- 
scribing the physical area used by an animal and 
should probably retain its simplicity for ease of 
communication among scientists, managers, and the 
public. 


Niche 


Wiens (1989b:146) called the niche one of the most 
variably defined terms in ecology. Two primary mean- 
ings have been ascribed to it: 

The Grinnellian niche of a species is the set or 
range of environmental features that enable individu- 
als to survive and reproduce. Grinnell's (1917) focus 
was on factors determining the distribution and abun- 
dance of species. 

In contrast, the Eltonian niche pertains to a species’ 
functional role in the community, especially with re- 
gard to trophic interactions (Elton 1927). Hutchinson 
(1957) reinforced and expanded this concept of the 
niche by mathematically describing a large number of 
environmental dimensions, each representing some re- 
source (see below), or other important factor, on 
which different species exhibit frequency distributions 
of performance, response, or resource utilization 
(Wiens 1989b:146). Collectively, the dimensions de- 
fine an n-dimensional space and the frequency distri- 
butions for a species define an n-dimensional hyper- 
volume within this space—its niche. And, according to 
Peters (1991:91), the hypervolume is an infinitely 
large set of properties that cannot be operationalized.
Subsets of properties can be quantified, however,
which makes the hypervolume useful conceptually but 
precludes testable hypotheses. 

Grinnell’s niche was thus primarily autecological, 
whereas the Hutchinson niche was synecological and 
was concerned with the relative position of a species 
within a community of species. Thus, the view taken 
results in a different emphasis of study: individualistic 
studies under the Grinnellian view and studies of com- 
munities under the Hutchinson view. As summarized 
by Wiens (1989a:147), there is nothing inherent in the 
Hutchinson concept that excludes incorporation of 
Grinnell’s concepts. The factors emphasized by Grin- 
nell may also be considered as dimensions within the 
Hutchinsonian niche.

Note that under the Hutchinsonian niche multiple
species are assumed to interact within a community; 
the community is assumed to exist. In contrast, Grin- 
nell's objective was to understand population regula- 
tion in terms of resources that limit the distribution 
and abundance of a single species; this understanding 
must precede analysis of communities (Wiens 
1989b:Table 6.1). Hutchinson's view thus can be in- 
terpreted to suggest that a level of organization (the 
community level) exists. As summarized by Wiens 
(1989b:178-179), patterns that emerge from studies 
of groups of species (e.g., “guilds,” “assemblages”)
may be consequences of imposing an arbitrary 
arrangement that is actually structured in some other 
way or not at all. 

Hutchinson's view of the niche thus violates our 
adoption of rigorous criteria for identifying
organizational levels. We agree with Wiens (1989b:176-177)
that much insightful work has resulted from studies of
“niches” (as well as of “communities” and “guilds”;
see below). However, we reiterate that no organiza- 
tional structure should be preimposed before it is de- 
termined from measured observations. 

Arthur (1987) recommended that we follow 
MacArthur’s (1968) quantification of the niche, which 
plots utilization against some quantifiable resource 
variable: the resource utilization function (RUF).
Arthur argued that it was better to build complexity 
as needed, as with RUFs, rather than to dissect it as 
when using a hypervolume concept. RUFs describe the 
choice of resources by animals; these choices can be
constrained by predators, competitors, and other
factors. We think that this type of approach is preferable
because it makes far fewer assumptions about organi- 
zational structure, and it can be tuned to fit specific 
questions. 
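
As a rough sketch of what a resource utilization function looks like in practice, the example below converts hypothetical counts of seeds taken, grouped by seed-size class, into proportional use along that single resource axis. The data and the function name `utilization_function` are invented for illustration; this is not the formulation of MacArthur (1968) or Arthur (1987), only the simplest empirical version of plotting use against a resource variable.

```python
def utilization_function(use_counts):
    """Proportion of total use falling in each class of a single resource variable."""
    total = sum(use_counts.values())
    return {resource: count / total for resource, count in use_counts.items()}

# Hypothetical observations: seeds taken, keyed by seed-size class (mm).
seeds_taken = {1: 5, 2: 20, 3: 50, 4: 20, 5: 5}

ruf = utilization_function(seeds_taken)
most_used = max(ruf, key=ruf.get)   # resource class with the highest use
print(ruf, most_used)
```

Further axes, or constraints such as predation risk, can be added one at a time as a question requires, which is the sense in which complexity is built up rather than dissected from a hypervolume.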

It is not our intent to recommend study designs and 
methods here (see Morrison et al. 1998 for recommen- 
dations on habitat, niche, and resource studies). How- 
ever, we recommend that the following standards 
should be followed when discussing the niche: 


1. Clearly describe the niche concept being assumed 
(e.g., see Wiens 1989b:Table 6.1 for guidance). For 
example, what ecological hierarchy, if any, is being 
assumed? Refer to the above discussion of biologi- 
cal levels for guidance. 

2. Describe the range of resources thought to influ- 
ence the species’ distribution, abundance, or inter- 
actions, and the specific subset being studied. This 
description is, of course, closely related to the de- 
scription under (1), above. 

3. Describe the specific relationship(s) being exam- 
ined or tested. That is, describe the RUFs (or other 
defined terminology). 

4. Clearly separate resources (e.g., food, space, miner- 
als, nest sites) from constraints on the use of those 
resources (e.g., predation, competition, activity 
time). 


Guilds 


Wiens (1989b:156-159) presented a brief but insight- 
ful review of the guild concept. Root (1967) defines 
guilds as “a group of species that exploit the same 
class of environmental resources in a similar way.” 
Wiens (1989a:156) identified three key elements of a 
guild: (1) species are syntopic (co-occurring in the
same habitat); (2) similarity among species is deter- 
mined by their use of resources (niche requirements) 
rather than by their taxonomy; and (3) competition 
among species is especially important. Most applica- 
tions of the guild concept have concentrated on sub- 
sets of species within the same family. Jaksic (1981) 
and MacMahon et al. (1981) argued that this misin- 
terprets Root’s (1967) intentions. A resource (e.g., 
seeds of a certain size) is used by a host of species, 
such as ants, lizards, rodents, and birds. Thus,
restricting guild membership to only one taxonomic category
captures only a segment of the potential interactions. 

Jaksic (1981) identified two types of guilds: a com- 
munity guild included all species that were known to 
use a specific resource in a similar way; an assemblage 
guild included species within a specific taxonomic cat- 
egory. He made these distinctions because he thought 
it was not possible to include all species using similar 
resources in a single study. As summarized by Wiens 
(1989b:159), the Root (1967) concept represents clus- 
terings of species overlapping in resource use, whereas 
the Jaksic-MacMahon concept examines how several 
consumers use a resource and how different resources 
share consumers. 

However, Jaksic’s (1981) community guild should 
be avoided because it mixes two related and overlap- 
ping terms. Furthermore, a community can conceptu- 
ally be subdivided into guilds (Wiens 1989b:159); 
thus, a level of organization is implied without the 
benefit of data. Likewise, assemblage guild should be 
avoided because when following Jaksic’s definition it 
gives the false impression that a guild is being ana- 
lyzed (when, in fact, only taxonomically related 
species are under study). 

Thus, we adopt the term species assemblage when 
one is simply studying some group of species for any 
number of interesting reasons. This is the same termi- 
nology that can be used to identify a “community,” 
because in both cases we are simply choosing a group 
of species for which we have no knowledge of the or- 
ganizational structure (i.e., level). As noted by Wiens 
(1989b:159), the primary value of the guild concept is 
to focus attention on sets of species sharing positions 
in niche space, or influencing resource dynamics in 
similar ways, so that the consequences of these simi- 
larities may be determined. We emphasize the latter 
part of Wiens’ comments: it is the consequences of the 
interactions between species and their resources, as re- 
flected in survival and fecundity, that are of primary 
importance in understanding ecological relationships. 
We should avoid artificially assigning a level of organ- 
ization (i.e., calling our work a study of “guilds”) 
when, in fact, no such level has been quantified. As 
summarized by Wiens (1989b:178-179), just because 
an ecological concept such as the guild may be a 
handy way to group species, it is not necessarily
biologically valid. This is because of the arbitrary nature
of most guild classifications: “patterns” that arise
from such studies may be the result of imposing an ar- 
bitrary arrangement on a group of species. 


Community 


A survey of the definitions of “community” (Wiens 
1989b:3-5) shows that the co-occurrence of individuals
of several species in time and space is common to
most. Additionally, most also stress the importance of 
interdependencies among the populations under study. 
Wiens (1989c:257-258) stated that “regardless of
how one defines ‘community,’ the community being
investigated and the criteria used to determine its
membership should be described explicitly.” He went
on to add that we “should not become overly concerned
with the semantics of communities or with
whether or not communities are ‘real’ or possess holistic
properties.” He concluded that multispecies assemblages
do occur in nature, and thus we should concentrate
on identifying interactions among these species.
Wiens chose to “accept the operational utility of talk- 
ing about bird communities as assemblages of individ- 
uals of several species that occur together.” We under- 
stand this viewpoint but suggest that Wiens’ 
conclusion does not advance the explicit study of com- 
munities. Rather, it furthers the arbitrary nature of our 
definitions and the haphazard way in which we con- 
duct our “community-level studies.” Likewise, we dis- 
agree with Fauth et al. (1996) that a community can 
be described simply by placing boundaries around a 
study site. We suggest that when the level of the com- 
munity cannot be identified from data, the term 
“species assemblage” be used instead of “commu- 
nity.” And, like Wiens, we urge that the composition 
of this assemblage and its boundaries—no matter how 
arbitrary—be explicitly stated. This is similar to the 
recommendations of Jaksic (1981) regarding terminol- 
ogy applied to guilds. Our arguments given for 
“guild” and “niche” apply here, and we will not reit- 
erate them. 


Population 


All studies must identify the population of inference. 
Population has usually been defined as a collection of 
individuals. However, few studies explicitly identify
the boundaries of the population under study. Thus,
the applicability of research results beyond the indi- 
vidual animals actually being studied is unknown. Be- 
cause most species are composed of numerous eco- 
types and often numerous subspecies, extrapolating 
beyond the individuals under study is seldom war- 
ranted and can lead to misuse of results. 

Individual animals interact with resources. Two or 
more animals interact individually and collectively 
with each other (e.g., through copulation, predation, 
and competition) and with the environment, forming 
populations (or subpopulations or metapopulations). 
Thus, the population has an organizational level. 
However, our understanding of this level will largely 
be based on the scale—extent and grain—chosen for 
study. 

Peak (1997:70-72) noted that several major problems
arise when discussing populations: (1) the number
of sampled populations may not represent the
number of relevant populations, and (2) the level of 
aggregation associated with the sample is likely not 
the appropriate level to quantify dynamics. Cooke 
(1997:188-189) noted that the foundation of popula- 
tion studies rests on the claim that a population can be 
characterized from local observations. However, al- 
though the traditional principles of population dy- 
namics are the basis of many resource management 
decisions, treatments are rarely applied to entire popu- 
lations but rather to demes that are treated as if they 
were populations, or to individuals as representatives 
of “the population.” Treating demes, which are sub- 
sets of a population, may be simpler than treating 
populations but may create changes in dynamics that 
are too complex to interpret readily. 

The term “population” does define a conceptual 
unit that has properties that an observer can define. 
The problem arises when we try to establish the do- 
main of inference from population studies. That is, 
having studied a “population” (i.e., some assemblage 
of individuals) and derived values for various measures
of that population, the problem is to determine to
what we can apply these measures. It is an example of
the distinction between statistical description (e.g., a 
mean and standard deviation) and statistical inference 
(e.g., how widely can we apply the statistical description?).
Our above comments on defining a population
apply primarily to the latter, while the former remains
a legitimate use of the concept of population (J.A. 
Wiens, personal communication). 

Thus, we recommend that the following standards 
be used in all studies: 


1. Clearly define the group of animals being studied 
with regard to spatial extent (i.e., define your study 
area). 

2. Explicitly describe how the study “population” 
was identified. That is, was a convenient area se- 
lected (e.g., a research station, study plot, island), 
or were ecological data (i.e., demographic or ge- 
netic) actually used to define the population (or 
deme or subpopulation) boundaries? 

3. Discuss how changes in grain and extent might in- 
fluence study results. 

4. Describe the current taxonomic classification of the 
species (i.e., species, subspecies) and known or sus- 
pected ecological classification (i.e., range of envi- 
ronmental conditions occupied, the ecotype). 

5. Describe the potential applicability of your results 
in light of 1-4, above. 


Ecosystem 


Fauth (1997) discussed the concept of ecosystem in 
light of recent ecological theory. As is usually done, 
Fauth concluded that the basic spatial and temporal 
dimensions of an ecosystem were user defined, render- 
ing an ecosystem an adimensional conceptual unit. He 
further noted that ecosystem boundaries could be arti- 
ficial (a square meter of lawn) or natural (a pond), de- 
pending upon one's goals. Fauth then categorized the 
components within this boundary as either biotic or 
abiotic. He defined all members of the biota that occur 
within a bounded area as a community and all abiotic 
factors within the same area as the abiotic environ- 
ment. Interactions within and between these two com- 
ponents form the ecosystem. 

In most cases, researchers will have little knowledge 
of an ecosystem, including its interactions and bound- 
aries. It is probably safer and less confusing to simply 
refer to the scale (grain and extent) of the study and restrict
discussion to the functions and relationships observed
in the area. Thus, we recommend that the following
standards be used when discussing ecosystems:

1. A level of organization should not be assumed;
rather, a statement should be made describing the 
specific functions and interactions under study or 
potentially influencing the study. 

2. Other potential influential factors and interactions 
not studied should be mentioned, along with their 
potential impacts on study results. 

3. The spatial and temporal extent of the study 
should be described (see also “Population”), no 
matter how arbitrary. 

4. The term “ecosystem” should be avoided altogether
and instead replaced with the descriptions
suggested in 2 and 3 above. 


Landscapes 


Landscape can be defined as a spatially heterogeneous 
area used to describe features (e.g., stand type, site, 
soil) of interest. King (1997:205-206) described a 
landscape primarily by its spatial extent. The term 
landscape level is best understood as the level of or- 
ganization revealed by observation at the spatial ex- 
tent of the landscape, but only after that level and the 
associated hierarchical organization have been deter- 
mined from the data collected (King 1997:205-206). 

A serious problem with application of the term 
“landscape” is that it is usually taken to mean rela- 
tively large areas (1-100 square kilometers) that are 
composed of interacting ecosystems (Forman and Go- 
dron 1986; Davis and Stoms 1996). However, the per- 
ception of “landscape” to a small animal is likely 
much different than that perceived by a large animal. 
As reviewed by King (1997:204), the fundamental 
themes of landscape ecology are not scale dependent 
or limited only to spatial extents greater than a few 
square kilometers. Questions of how spatial hetero- 
geneity influences biotic and abiotic processes—a 
theme defining landscape ecology—can be addressed 
at virtually any spatial scale. By adopting an organism-centered
view, a landscape can credibly be described
using a microscope or a satellite. We therefore recommend
that no area limitation be placed on the
landscape. Describing a landscape in whole kilometers
will be appropriate for certain applications (e.g., gap
analysis), whereas describing it in a few square meters
will be appropriate for other uses (e.g., salamander niche
relationships).


Resources


Wiens (1989c:262) noted that “Although ‘resources’ 
are involved in most explanations of community pat- 
terns, all too often they have been defined in ad hoc 
ways, rarely measured directly, or inferred to be limit- 
ing on the basis of faith rather than evidence.” Many 
of the items we label as resources (or natural re- 
sources) are actually artificial constructs that do not 
have clear biological definitions or justifications. For
example, the “rangelands” resource or “rangeland 
vegetation” is composed of numerous abiotic and bi- 
otic properties and cannot be quantified, per se (Mor- 
rison and Marcot 1995). Rather, the actual resources 
in a rangeland include air, water, minerals, soil, sun- 
light, flora, and fauna. Thus, to really define a re- 
source, the area of interest must be explicitly identified 
as to its spatial extent and broken down into its meas- 
urable component elements. 

Little attention has been given to the identification 
and measurement of resources. As summarized by 
Wiens (1989b:321), almost any environmental factor 
that correlates with the distribution and abundance of 
a species has been called a resource. Most studies of 
“resource partitioning” employ circular logic: vari- 
ables that are used differentially by species are termed 
resources, and the coexistence of species is then ex- 
plained by the partitioning of those resources. Further- 
more, Wiens (1989b:323) pointed out that without a 
precise definition of the resources present, it is not pos- 
sible to derive accurate patterns of resource use or 
niche relationships (see Tilman’s [1982] work on re- 
source competition in plants for a possible exception). 

Wiens (1989b:321) provided two critical features 
for the definition of a resource: (1) it must be used by 
the organism, and (2) it must be at least potentially 
limiting to individual fitness and/or population dy- 
namics. He included the limitation criterion because 
his focus was on the potential for competition. We 
think, however, that the criterion of limitation is un- 
necessary for our application. Requiring a resource to 
be potentially limiting invokes competition as an as- 
sumed driving force in species interactions; a level of 
organization is thus required before an item—even if it 
is consumed and is essential to survival—be consid- 
ered a resource. Thus, we instead define a resource as 
any biotic or abiotic factor that is directly used by an
organism. Resources that are limiting to an organism
could then be referred to as “limiting resources.” 

Wiens (1989b:321-323) also noted that it is critical
that the differences between resource abundance, 
availability, and use be distinguished to be certain 
which one is actually being measured. Resource abundance
is the absolute amount (or size or volume) of an
item in an explicitly defined area; for example, the
number of food items in a 1-hectare area. Resource
availability is the amount of a resource actually available
to the animal (i.e., the amount exploitable); for
example, the number of food items in a 1-hectare area
that an ungulate can reach. Finally, resource use is a
measure of the amount of the resource directly taken
(e.g., consumed, removed) from an explicitly defined
area; for example, the number of food items in a
1-hectare area that an animal consumed in a six-hour
sampling period.
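
The distinction among abundance, availability, and use can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `FoodItem` record, the item heights, the 2-meter reach of the animal, and the items marked as consumed are hypothetical and are not data from any study cited here.

```python
from dataclasses import dataclass


@dataclass
class FoodItem:
    height_m: float  # height above ground, in meters (hypothetical attribute)
    consumed: bool   # True if the item was eaten during the sampling period


def resource_measures(items, max_reach_m):
    """Return (abundance, availability, use) for one explicitly defined area."""
    # Abundance: every item present in the area, reachable or not.
    abundance = len(items)
    # Availability: only the items the animal can actually exploit.
    availability = sum(1 for item in items if item.height_m <= max_reach_m)
    # Use: items actually taken during the sampling period.
    use = sum(1 for item in items if item.consumed)
    return abundance, availability, use


# Hypothetical 1-hectare plot: six food items, two eaten in a six-hour sample.
plot = [
    FoodItem(0.5, True),
    FoodItem(1.2, False),
    FoodItem(1.8, True),
    FoodItem(2.5, False),
    FoodItem(3.0, False),
    FoodItem(0.9, False),
]

abundance, availability, use = resource_measures(plot, max_reach_m=2.0)
print(abundance, availability, use)  # 6 4 2
```

The three numbers differ even though they describe the same plot, which is why a study should state which of the three was actually measured.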

Thus, we recommend that all “resources” invoked 
by an investigator as an ecological explanation for ob- 
served patterns should be clearly identified and dis- 
cussed with regard to their use by the organism(s) 
under study, and careful distinctions should be drawn 
between the abundance, availability, and use of the re- 
sources by the organisms. 


Concluding Remarks 


Definitions have little heuristic value if created outside 
an explicit theoretical framework. Definitions made in 
the absence of such a framework are, in essence, “def- 
initional orphans.” Thus, for the ecological disci- 
plines, we advocate the use of standardized terms 
based as far as possible on operationalized concepts. 
In this chapter we have suggested standardized defini- 
tions for many critical and commonplace terms in 
ecology. Authors may disagree with a definition but if 
so should provide their own justified and operational- 
ized definition so that further ambiguity in the use of 
terminology is avoided.


Appendix


Provided here are definitions of standard terms used in 
studies of the distribution and abundance of animals. 
Terms followed by “see text” are discussed in detail in
the main text of this chapter. 


Abundance The number of individuals (Lancia et al. 
1994); contrast with Density. 


Accuracy The nearness of a measurement to the actual 
value of the variable being measured; not synonymous 
with Precision (Zar 1984:4). 


Assemblage A group of species under study; see discussion
of “community” in text.


Census A complete enumeration of an entity (modified 
from Lancia et al. 1994). 


Community The co-occurrence of individuals of several
species during a specified time and space that are
interacting and show some degree of interdependen- 
cies; see text. 

Complexity Relative comparisons of grains separated 
by a given distance. 

Density The number of individuals per unit area (Lan- 
cia et al. 1994). 

Distribution The spread or scatter of an entity within 
its range. 

Ecosystem The specific functions and interactions 
under study or potentially influencing the study 
(within an explicitly defined area); see text. 

Extent The area over which observations are made 
and the duration of those observations; see text. 

Grain The spatial and temporal resolution of observations;
the smallest resolvable unit of study; see text.

Guild A group of species that exploit the same class of
environmental resources in a similar way (Root 1967);
see text.

Habitat The physical space within which the animal 
lives, and the abiotic and biotic entities (e.g., resources)
in that space (see also Hall et al. 1997b); see
text.


Habitat Availability The accessibility and procurability
of physical and biological components in a habitat
(Hall et al. 1997). 


Habitat Avoidance An oxymoron that should not be 
used; wherever an animal occurs defines its habitat. 


Microhabitat, Mesohabitat Relative terms that refer to
the grain size of the area over which habitat is being 
measured (see also Hall et al. 1997b). 


Habitat Preference Used to describe the relative use of 
different locations (habitats) by an individual or 
species. 


Habitat Quality The ability of the area to provide 
conditions appropriate for individual and population 
persistence (Hall et al. 1997b). 


Habitat Selection A hierarchical process involving a 
series of innate and learned behavioral decisions made 
by an animal about what habitat it would use at dif- 
ferent scales of the environment (Hutto 1985:458, 
Hall et al. 1997b). 


Habitat Use The way an animal uses (or “consumes,”
in a generic sense) a collection of physical and biolog- 
ical entities in a habitat (Hall et al. 1997b). 


Home Range The area traversed by an animal during 
its activities during a specified period of time. 


Landscape A spatially heterogeneous area used to de- 
scribe features (e.g., stand type, site, soil) of interest; 
see text. 


Landscape Feature Widespread or characteristic fea- 
tures within the landscape (e.g., stand type, site, soil, 
patch); see text. 


Level The level of organization revealed by observa- 
tion at the scale under study (King 1997); see text. 


Metapopulation Strictly, a system of populations of a 
given species in a landscape linked by balanced rates 
of extinction and colonization. More loosely, the term 
is used for groups of populations of a species, some of 
which go extinct while others are established, but the 
entire system may not be in equilibrium (Pickett and 
Rogers 1997). 

Model Any formal representation of the real world. A 
model may be conceptual, diagrammatic, mathematical,
or computational.


Model Calibration The estimation of model parameters
from data.


Model Parameterization The process of specifying a 
model structure (see Conroy and Moore, Chapter 16). 


Model Validation Comparison of a model’s predic- 
tions to some user-chosen standard to assess if the 
model is suitable for its intended purpose. 


Model Verification The demonstration that a model is 
formally correct. 


Niche The strength and frequency of interactions be- 
tween the individual and entities (e.g., resources, other 
animals) in its habitat; see text. 


Patch A recognizable area on the surface of the earth 
that contrasts with adjacent areas and has definable 
boundaries (Pickett and Rogers 1997). 


Population Classically, a collection of individuals; see 
text. 


Precision The closeness to each other of repeated 
measurements of the same quantity; not synonymous 
with accuracy (Zar 1984:4). 


Range The limits within which an entity operates or 
can be found. 


Resolution The smallest spatial scale at which we por- 
tray discontinuities in biotic and abiotic factors in 
map form (Hargis et al. 1997). 


Resource Any biotic and abiotic factor directly used 
by an organism; see text (see also Manly et al. 1993). 


Resource Abundance The absolute amount (or size or 
volume) of an item in an explicitly defined area. 


Resource Availability A measure of the amount of a 
resource actually available to the animal (i.e., the 
amount exploitable). 


Resource Preference The likelihood that a resource
will be used if offered on an equal basis with others
(Manly et al. 1993).


Resource Selection The process by which an animal
chooses a resource.


Resource Use A measure of the amount of resource 
taken directly (e.g., consumed, removed) from an ex- 
plicitly defined area. 


Scale The resolution at which patterns are measured, 
perceived, or represented. Scale can be broken into 
several components, including grain and extent; see 
text.


Scale of Observation The spatial and temporal scales 
at which observations are made. Scale of observation 
has two parts: extent and grain. 


Sensitivity Analysis A process in which model parame- 
ters or other factors are varied in a controlled fashion 
(see Conroy and Moore, Chapter 16). 


Sink Population In a landscape, a population or site
that attracts colonists, while not supplying migrants to 
other sites or populations (Pickett and Rogers 1997). 


Site An area of uniform physical and biological prop- 
erties and management status (contrast with Study 
Area). 


Source Population In a landscape, a population or a 
site that supplies colonists to other patches (Pickett 
and Rogers 1997). 


Study Area An arbitrary spatial extent chosen by the 
investigator within which to conduct a study (contrast 
with Site and Scale). 

Territory The spatial area defended (actively or pas- 
sively) by an animal or group of animals. 

Viability Strictly, the ability to live or grow. In conservation
biology, the probability of survival of a population
for an extended period of time.


Linking Populations and Habitats:
Where Have We Been? Where Are We Going? 


Dean F. Stauffer 


As we consider the state-of-the-art of modeling animal-habitat
relationships, it is useful to consider
the work of those who have come before. Understand- 
ing the historical context of our current approaches to 
modeling and analysis can help to provide a sense of 
“place” in the discipline, and also, if we are attentive, 
may help us to avoid errors and unproductive research 
avenues identified by those who have explored the 
limits of the field in the past. 

As a discipline, the arena of quantitative ecology 
and, specifically, modeling wildlife-habitat relation- 
ships, is relatively young, yet much has been accom- 
plished in the past five decades. My goal here is to 
provide an overview of how our approach to analyz- 
ing wildlife-habitat relationships has evolved over the 
past fifty years. I will consider changes in the use of 
computers, statistics, and philosophical trends over 
this period. Clearly, those topics I address reflect my 
experiences and biases and may not cover all topics 
that others may consider important. The space avail- 
able does not allow complete coverage of all relevant 
and important literature; thorough historical reviews 
have been provided by Karr (1980) and Block and 
Brennan (1993). 

Several general stages have been identified in the 
development of natural history disciplines. The cataloging
stage came first, perhaps beginning with Aristotle
(Block and Brennan 1993). This stage evolved into
a natural history era during the mid-1800s. Through
the first part of the twentieth century, we can note a dramatic
increase in natural history work on a variety of 
species. The mid-1950s might be considered the 
dawning of the quantitative ecology era. It is the pe- 
riod from the 1950s to present that I wish to address. 

For the most part, this overview considers the de- 
velopment of computing and statistical methods and 
how they have influenced the analysis of wildlife-habi- 
tat relationships. I conducted, with the assistance of 
undergraduate students at Virginia Polytechnic Insti- 
tute and State University (Virginia Tech), a review of 
415 habitat papers published in the Journal of
Wildlife Management (JWM) for the period
1965-1996. For each paper, we noted which statistical
methods were used to analyze wildlife habitat. I summarized
results for decades and half-decades
(1965-1969, 1970-1979, 1980-1989, and 1990-1996)
to provide an overview of how use of statistical procedures
has changed over time. Clearly, this is not a
complete coverage of the field (N = 1, 0 df) but I as- 
sume that the trends in JWM reflect general trends in 
analytic approaches, even though there will be some 
variation among specific disciplines. For the ensuing 
discussion, I consider development of approaches dur- 
ing four general eras. Within each, I will consider the 
state of computer resources, statistical methods used, 
and major analytic and philosophical trends during 
the period.


Where Have We Been?


The Qualitative-Quantitative Transition: 
The 1950s and 1960s 


In the mid-1950s, we can see the beginning of the 
move from qualitative natural history research to 
work that was more quantitatively focused. During 
this time, Hutchinson (1957) presented the notion of 
an animal’s niche being an n-dimensional hypervol- 
ume. This concept of species being arrayed along mul- 
tiple environmental gradients influenced how many in 
the future would look at wildlife-habitat relationships. 

At the same time, MacArthur (1958) published a 
rigorous and quantitative, though not highly statisti- 
cal, analysis of niche relationships of five warbler 
species. Not long after, MacArthur and MacArthur 
(1961) presented their initial work relating bird 
species diversity (BSD) to foliage height diversity 
(FHD) and introduced the use of H’ as a measure of 
diversity. Their regression equation of BSD on FHD
might be considered one of the first wildlife-habitat 
models. 
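
To make the idea of regressing BSD on FHD concrete, the following is a minimal sketch. It assumes that H’ is the Shannon index computed with natural logarithms, and it uses invented foliage and bird counts for three hypothetical plots; it does not reproduce MacArthur and MacArthur's data or their fitted coefficients.

```python
import math


def shannon_index(counts):
    """H' = -sum(p_i * ln(p_i)) over categories with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical plots: foliage amounts in three height layers, and counts of
# individuals per bird species. These numbers are illustrative only.
foliage_by_plot = [[10, 0, 0], [6, 4, 0], [4, 3, 3]]
birds_by_plot = [[8, 2], [5, 3, 2], [4, 3, 2, 1]]

fhd = [shannon_index(layers) for layers in foliage_by_plot]
bsd = [shannon_index(species) for species in birds_by_plot]

intercept, slope = simple_linear_regression(fhd, bsd)
print(f"BSD = {intercept:.3f} + {slope:.3f} * FHD")
```

A plot with all foliage in one layer has FHD of zero, and FHD rises as foliage is spread more evenly among layers; the regression then asks whether BSD rises with it.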

Even as some researchers were becoming more 
quantitative in their approach to habitat analysis, oth- 
ers were wishing for a consistent way to evaluate habi- 
tat. Hanson and Miller (1961:75) stated, “The work 
of game managers would be aided if they could readily 
identify some attribute of cover that permits rapid es- 
timation of carrying capacity for bobwhite.” They 
clearly were seeking a wildlife-habitat model that 
could be applied in a management context. 

Although we see the stirrings of quantitative ap- 
proaches in the 1960s, of the thirty-four JWM papers 
reviewed for the latter part of this period, only nine- 
teen used statistics. During this time, much habitat 
analysis was qualitative and descriptive, relying on 
terms such as “very dense, dense, open, very open” 
(Bendell and Elliot 1966) with little accompanying 
data to tell us just how dense “dense” was. When sta- 
tistical analyses were used, the dominant methods ap- 
plied were t-test, ANOVA, Chi-square, correlation, 
and simple linear regression (Fig. 3.1). Access to com- 
puters was limited and the majority of calculations 
were carried out by hand or with rotary calculators 
and slide rules; as a result, relatively simple statistics 
were typically used, and use of computers warranted a
special note. For example, Klebenow (1969) reported
the use of an IBM 1620 and noted that his analysis 
was limited to twenty variables because of computing 
limitations. 

As the 1960s ended, we saw that a theoretical 
framework for habitat and niche analysis was 
evolving and that statistics were beginning to be used, 
although their use was limited by computational 
capabilities. 


The Era of Multivariate Muddles: The 1970s 


The 1970s represent a decade during which there 
was tremendous growth in computer availability and 
power, with an accompanying increase in the use of 
computationally intensive statistical procedures. 
During this time, there was a substantial increase in 
the use of statistical analysis of habitat in JWM pa- 
pers (Fig. 3.1). Most noticeable is the greater number 
of papers using regression approaches, particularly 
multiple regression after 1975. I attribute this to the 
availability of statistical computer software that 
would determine the inverse of the X’X matrix, 
which is an unpleasant undertaking, at the least, 
with a rotary calculator, or that previously would 
have required custom computer programs. During 
this time, we also saw an increase in the application 
of nonparametric statistics. However, in JWM the 
use of multivariate statistics was relatively low com- 
pared to its published use in other journals, such as 
Ecology. 
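
For readers unfamiliar with why the inverse of the X’X matrix matters, the standard least-squares solution for multiple regression is sketched below. The symbols are notation assumed for this illustration: X is the matrix of predictor values (with a leading column of ones for the intercept), y is the vector of responses, and the hatted beta is the vector of estimated coefficients.

```latex
\hat{\boldsymbol{\beta}} = (X'X)^{-1} X' \mathbf{y}
```

With p habitat variables plus an intercept, X’X is a (p + 1) by (p + 1) matrix, so the labor of inverting it by hand grows rapidly as variables are added.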

This growth in statistical applications parallels the 
continued growth of computing power. Whereas at 
the beginning of the decade many researchers were an- 
alyzing their data by hand, the majority of analysis 
was done with computers by 1979. During this time, 
we saw the development of comprehensive statistical 
packages such as BMDP, SAS, and SPSS. We were no 
longer dependent on custom programs developed at 
individual computing centers for analysis of data. Also 
during this time came the ability to store data on 9- 
track tapes or card decks, making it more portable, 
but it also meant we were developing the means to 
make mistakes more quickly than ever before. 

Figure 3.1. Percentage of reviewed papers from the Journal of Wildlife Management that used various statistical techniques at different
time periods. Sample sizes for each time period were 1965-1969, 34; 1970-1979, 76; 1980-1989, 195; and 1990-1996,
110. Statistical methods summarized are t-test, analysis of variance (ANOVA), Chi-square, G-test, simple correlation (Corr), simple
linear regression (Simple), multiple linear regression (Multiple), multiple response permutation procedures (MRPP), Wilcoxon rank-sum
test (WILCOXON), Kruskal-Wallis test (K-W), principal components analysis (PCA), discriminant function analysis (DFA), log-linear
analysis (Log-lin), and logistic regression (Log Reg).

The most dominant analytical feature of the 1970s
was the growth in application of multivariate methods
such as discriminant function analysis, principal components
analysis, and factor analysis. The first
published applications of multivariate statistics to 
wildlife habitat I am aware of actually come from the 
late 1960s: Cody's (1968) analysis of grassland bird 
habitat and Klebenow's (1969) work on greater sage- 
grouse (Centrocercus urophasianus). However, I be- 
lieve the seminal work was that of James (1971). She 
introduced to us the idea of niche-gestalt as a descrip- 
tion of a bird's habitat, and operationalized the multi- 
dimensional niche of Hutchinson (1957) through her 
analysis. This paper had a great influence on me as a 
graduate student and how I looked at habitat, and I 
suspect many others also were energized and moti- 
vated by her work. Subsequently, throughout the 
1970s multivariate analyses were applied to a variety 
of taxa in a diversity of habitats (e.g., Martinka 1972;
Anderson and Shugart 1974; Dueser and Shugart
1978). 

A potential problem with the application of 
multivariate statistics, however, was that anyone capa- 
ble of entering their data into a matrix that could be 
processed by a statistical program could conduct a 
multivariate analysis. It was not uncommon during 
this time to see applications of new and unfamiliar 
multivariate procedures that failed to address assump- 
tions concerning the use of such techniques. As a re- 
sult, many of the analyses carried out were of limited 
value. 

Another aspect of wildlife habitat research that was 
addressed during this time was the question of when a 
resource, such as a cover type, is preferred. Neu et al. 
(1974) presented an approach using chi-square to
assess habitat preference in animals. We terrestrial
workers had lagged somewhat behind our aquatic col- 
leagues in addressing this question (Ivlev 1961). This 
approach provided a template for quantitative analy- 
sis of habitat-use-availability data and meshed nicely 
with the increasing number of radiotelemetry studies. 
However, as with the multivariate analyses, assump- 
tions about the method being used were often not 
considered.
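
The use-availability comparison described above can be sketched in a few lines. This is a minimal illustration, assuming the chi-square goodness-of-fit statistic followed by Bonferroni-adjusted intervals on the proportion of use; the cover types, counts, and the function name `use_availability` are hypothetical. Note that the sketch treats every location as an independent observation, which is exactly the kind of assumption that was often left unexamined.

```python
from statistics import NormalDist

def use_availability(used_counts, available_props, alpha=0.05):
    """Chi-square statistic plus Bonferroni intervals for use vs. availability."""
    n = sum(used_counts.values())
    k = len(used_counts)
    # Goodness-of-fit: observed use against use expected from availability.
    chi_sq = sum(
        (used_counts[h] - n * available_props[h]) ** 2 / (n * available_props[h])
        for h in used_counts
    )
    # Bonferroni adjustment: split alpha across k simultaneous two-sided intervals.
    z = NormalDist().inv_cdf(1 - alpha / (2 * k))
    verdicts = {}
    for h, count in used_counts.items():
        p_used = count / n
        half_width = z * (p_used * (1 - p_used) / n) ** 0.5
        if available_props[h] < p_used - half_width:
            verdicts[h] = "used more than available"
        elif available_props[h] > p_used + half_width:
            verdicts[h] = "used less than available"
        else:
            verdicts[h] = "used in proportion"
    return chi_sq, verdicts

# Hypothetical telemetry locations and mapped cover-type proportions.
used = {"forest": 60, "shrub": 25, "meadow": 15}
available = {"forest": 0.40, "shrub": 0.30, "meadow": 0.30}
chi_sq, verdicts = use_availability(used, available)
```

The statistic would still need to be compared with a chi-square distribution on k - 1 degrees of freedom before the intervals are interpreted.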

As researchers were forging ahead with sophisti- 
cated statistical techniques and powerful (or so it 
seemed at the time) computers to seek the truth under- 
lying how animals were relating to their habitats, 
managers were seeking ways in which to use such in- 
formation to address their needs. The National Envi- 
ronmental Policy Act of 1969 (U.S. Laws, Statutes, 
etc., Public Law 91-190) required that the impacts of 
activities using federal funds be described prior to ac- 
tion. This included impacts on wildlife habitat. As a 
result, approaches were devised to allow the assess- 
ment of these impacts. Prevalent among the methods 
proposed were the Habitat Evaluation Procedures 
(HEP) developed by the U.S. Fish and Wildlife Service 
(USFWS 1980, 1981a) that made use of habitat suit- 
ability index (HSI) models and the wildlife-fish habitat 
relationships models of the U.S. Forest Service 
(Thomas 1982). Pattern recognition (PATREC) mod- 
els, based on Bayesian statistics, also were adapted 
during this time to address natural resource problems 
(Williams et al. 1978). As these approaches were de- 
veloped, the groundwork was likewise laid for poten- 
tial conflicts between researchers, who were seeking 
“truth,” and the managers, who were seeking tools to 
help them meet their resource management needs. 

During this era, we see relatively little considera- 
tion of the precision or accuracy of the models that 
were being developed. A statistically significant (e.g., 
P < 0.05) result for a regression or multivariate analy- 
sis usually was taken to mean that the model was 
“good,” and the models were accepted somewhat un- 
critically, perhaps because many researchers did not 
fully understand the assumptions underlying the pro- 
cedures. Classification accuracy of models was typi- 
cally assessed by how well the model predicted the 
data used to build the model, which, as we would ex- 
pect, indicated the models predicted well. 
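
Why scoring a model on its own data is uninformative can be seen in a small sketch. It assumes a one-nearest-neighbor classifier (chosen only because it memorizes its data, not because it was used in the studies discussed) and hypothetical presence/absence labels that are pure noise, so there is no real habitat relationship to find.

```python
import random

def nearest_label(point, training):
    """Label of the training record closest to `point` (1-nearest neighbor)."""
    return min(training, key=lambda rec: (rec[0] - point) ** 2)[1]

def accuracy(records, training):
    """Proportion of records whose label the model reproduces."""
    hits = sum(nearest_label(x, training) == label for x, label in records)
    return hits / len(records)

def make_sites(rng, n):
    """Hypothetical sites: one habitat measurement and a random 0/1 label."""
    return [(rng.random(), rng.choice([0, 1])) for _ in range(n)]

rng = random.Random(1)
training = make_sites(rng, 200)
independent = make_sites(rng, 200)

resubstitution = accuracy(training, training)   # model scored on its own data
validation = accuracy(independent, training)    # model scored on new sites
```

The model classifies its own data perfectly, yet on independent sites it should do little better than a coin toss, because the labels carry no signal.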


Thus, in the arena of habitat analysis, the decade 
closed with many on the multivariate bandwagon, 
convinced that with adequate computer power and so- 
phisticated statistics there was no problem that could 
not be overcome. However, even as we forged ahead, 
there were some who reevaluated what we had done 
and offered cautions. For example, Roth (1976) found 
that in some habitats, the BSD-FHD link of Mac- 
Arthur (MacArthur and MacArthur 1961) didn't 
apply in all cases and that horizontal diversity within 
a habitat may be as important, or more so, than verti- 
cal diversity in influencing bird diversity. At the close 
of the decade, Karr (1980) urged care in the
application of multivariate statistics, noting that we 
should be aware of biologically—and not necessarily 
statistically—significant relationships in our data. 


Manifest Destiny and Gadflies: The 1980s 


As we came into the 1980s, the enthusiasm of the pre- 
vious decade continued, and it seemed there was noth- 
ing we couldn't accomplish with more computing and 
statistical power. This period saw the advent of the 
personal computer for data analysis and, by the end of 
the decade, only $5,000 would purchase a 386-20 
megahertz machine with at least a 200-megabyte hard 
drive and 4 megabytes of RAM. In JWM, we saw dur- 
ing this time a continued increase in the use of various 
statistical methods to analyze habitat relationships 
(Fig. 3.1). There was an increase and then a decline in 
multivariate methods during the decade and a general 
decline in the use of regressions. However, there also 
was an increase in the use of nonparametric statistics, 
which likely represents an increasing awareness of the 
assumptions associated with parametric methods and 
the desire to not violate them (but see Johnson 1995). 
For the first time, the use of log-linear methods, which 
are computer intensive, also appears (Fig. 3.1). Multi- 
variate methods continued to see substantial use in 
this decade, and a symposium dedicated to multivari- 
ate analysis was held in 1980 (Capen 1981). 

Progress was made in the arena of preference as- 
sessment and Johnson (1980) provided a new tech- 
nique, based on ranks, to complement and at times re- 
place the chi-square approach of Neu et al. (1974). A 
little later, Alldredge and Ratti (1986) presented an 
evaluation of the behavior of several preference assessment
approaches applied to a variety of data sets with
known attributes. They demonstrated the low power 
of tests done with few animals and that the number of 
animals, observations per animal, and number of 
habitat types all will affect interpretation of prefer- 
ence. At the end of the decade, Thomas and Taylor 
(1990) presented a very clear description of three basic 
study designs that lead to preference assessment analy- 
ses. Their paper has helped to remove confusion that 
some researchers may have had concerning what their 
exact sampling unit was. 

As the habitat analysis bandwagon rolled on 
through the 1980s, a notable addition was the analysis
of landscapes. This period saw landscape ecology
come into its own as a discipline (Forman and Godron 1986).
Analysis of large-area data was greatly facilitated by 
the increased availability of remotely sensed data (e.g., 
Palmeirim 1988), geographic information system 
(GIS) software, and powerful computers capable of 
handling large data sets. These technologies allowed 
us to measure habitat efficiently at new scales and to 
relate animal occurrences to these large-scale patterns 
(e.g., Lyon et al. 1987). As this new discipline devel- 
oped, a new set of habitat metrics was generated as we 
sought to understand the relationship between 
patches, edges, and corridors. Unfortunately, it appears
to be easier to invent a “new” measure of a
landscape than it is to clearly explain exactly what 
that measure means. Although numerous new metrics 
have been suggested, many are redundant and difficult 
to understand or to relate in a meaningful way to ani- 
mal populations (e.g., as the fractal dimension of the 
landscape changes, what does it really mean?). 

Perhaps the distilling event of the decade was the 
convening in 1984 of researchers working on modeling 
wildlife-habitat relationships at the Wildlife 2000 con- 
ference. This resulted in the publication of Wildlife 
2000: Modeling Habitat Relationships of Terrestrial 
Vertebrates (Verner et al. 1986b) and serves as a rela- 
tively complete summary of modeling approaches, is- 
sues, and concerns as they were at that time. A partic- 
ularly useful aspect of the published proceedings was 
the presentation of managers' and researchers' views of 
various issues within modeling. Through these essays, 
it was recognized that these two groups have different
needs and agendas that should be considered, and their
sometimes-conflicting views gained recognition. 

Up to this time, a large number of models of vari- 
ous types had been developed, but seldom were they 
actually tested against independent data sets to evalu- 
ate their accuracy. The 1980s saw many tests con- 
ducted to assess the accuracy of models representing 
wildlife-habitat relationships. A number of the papers 
presented at the Wildlife 2000 conference addressed 
model accuracy (Verner et al. 1986b). Sweeney and 
Dijak (1985) tested a PATREC model for ovenbirds 
(Seiurus aurocapillus) and found it predicted pres- 
ence/absence on independent sites with 94 percent ac- 
curacy. Other representative model tests for this pe- 
riod include Morrison et al.’s (1987) evaluation of 
regression models and Lancia and Adams’ (1985) tests 
of habitat suitability models. Morrison et al. (1987) 
concluded that their regression models were useful for 
predicting presence/absence but lacked the precision 
to adequately project bird abundance, whereas Lancia 
and Adams (1985) concluded that HSI models pre- 
dicted adequately but pointed out that scale is an im- 
portant consideration when testing models. Also 
during this time, O'Neill et al. (1988a) provided information
on how to adjust HSI models to adapt to local
conditions and in doing so helped to bridge the re- 
searcher-manager gap. 

As researchers forged ahead with their efforts to 
conduct studies on wildlife habitat using the myriad of 
new techniques available, several works were 
published that considered the status of the field. 
Romesburg (1981) issued a challenge to the wildlife 
community to “shape up.” He provided a thoughtful 
and, perhaps to some, annoying essay that pointed out 
that a substantial amount of the research done in 
wildlife amounts to a whole lot of nothing much, 
based on what he termed “general purpose data.” He 
encouraged the use of the hypothetico-deductive (HD) 
method whereby clear hypotheses based on the best 
understanding of the system being studied would be 
postulated. Once the hypotheses were established, 
then a study could be designed to test the hypotheses. 
I believe a solid contribution of his efforts was to re- 
mind us that it is important to think before we go into 
a study; we can't simply assume that with enough 
data, the computer will sort it out. Romesburg's 
(1981) paper led to a particular emphasis for using the
HD method in the mid- to late-1980s for papers in 
JWM. However, this may have led to some high- 
quality work not being published. When we have a 
base of information to work from then the HD 
method may work well, but in many cases we still 
need to establish the pattern (i.e., descriptive research) 
for a system before we can develop a hypothesis to ex- 
plain the pattern (e.g., Wiens 1989b). 

Much of the modeling done (and still being done 
today) has used some measure of population density 
or abundance as the response variable that is related 
to some suite of habitat characteristics. Van Horne 
(1983) brought to our attention that at times density 
may be a misleading indication of habitat quality and 
that demographic information on survival and fecun- 
dity will aid greatly in establishing the quality of a 
particular habitat. She describes cases when density is 
a poor indicator, and her work helped many to think 
more clearly about what they were doing when model- 
ing wildlife habitat. One other influential work in the 
early 1980s was that by Hurlbert (1984). He clearly 
pointed out the importance of understanding the con- 
cept of the sampling or experimental unit and how 
pseudoreplication can invalidate many of our conclu- 
sions drawn from statistical analyses. I suspect that 
most workers have been guilty of pseudoreplication at 
some time in their career. For example, I was able to 
miraculously generate 1,349 data points for a regres- 
sion analysis from twenty-eight study sites (Stauffer 
and Best 1980, 1986), yet this point passed by two ed- 
itors and numerous referees. Pseudoreplication still re- 
mains a problem today and is something we all need 
to guard against. 
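
One way to guard against this is to collapse repeated observations to the sampling unit before analysis. The sketch below assumes the study site is that unit; the records and the function name `site_means` are hypothetical and are not the data from the studies cited above.

```python
from collections import defaultdict
from statistics import mean

def site_means(observations):
    """Collapse repeated observations to one (x, y) pair per study site."""
    by_site = defaultdict(list)
    for site, x, y in observations:
        by_site[site].append((x, y))
    return {
        site: (mean(x for x, _ in pairs), mean(y for _, y in pairs))
        for site, pairs in by_site.items()
    }

# Hypothetical records: (site, shrub cover %, birds counted), several visits per site.
observations = [
    ("A", 10, 2), ("A", 12, 3), ("A", 11, 4),
    ("B", 30, 6), ("B", 28, 5),
    ("C", 50, 9), ("C", 52, 8), ("C", 51, 10),
]
per_site = site_means(observations)

naive_df = len(observations) - 2   # treats every visit as independent
honest_df = len(per_site) - 2      # treats the site as the sampling unit
```

A simple regression on the eight visits would claim six residual degrees of freedom, whereas the three sites support only one.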

Near the end of the decade, Rexstad et al. (1988) 
took multivariate statistics to task. They analyzed a 
data set of random data composed of numbers from 
sources such as phone book numbers, liquor prices, 
produce prices, and mean temperatures. They were 
able to demonstrate “significance” within the analyses 
and provided a useful warning concerning the appar- 
ently arbitrary nature of interpretation of some multi- 
variate techniques. Bunnell (1989) presented an enjoy- 
able discourse on the need to bring together 
researchers (cerebral anarchists in his terminology) 
and managers (alchemists) when modeling so that
there is clear communication concerning goals and ob-
jectives, and he provided suggestions on how this 
might be done. 

As we approached the next decade, we saw that the 
tremendous growth in quantitative approaches contin- 
ued to address issues at different scales and took ad- 
vantage of exponentially increasing computing power. 
However, during this time we were also treated to dis- 
courses that reflected on our work and provided guid- 
ance to ensure we correctly approached wildlife habi- 
tat analysis. 


Refinement and Moderation: The 1990s 


The theme and patterns of habitat analysis established 
in the 1980s continued into the 1990s. Computer 
power has continued to increase, and we now have on 
our desks computers more powerful than the main- 
frame computers some of us used for our graduate 
work. All major statistical analysis packages are now 
available for personal computers, and the concept of 
the “Computer Center” is foreign to most of our graduate
students.

In this decade, we can see continued refinement of 
statistical approaches. Particularly noticeable is the 
advent of logistic regression and permutation proce- 
dures (Mielke 1986). It seems that during the last ten 
years, researchers have paid more attention to as- 
sumptions associated with various statistical proce- 
dures, and, as a result, application of various tests and
analyses is generally appropriate. One trend that car-
ried over from the 1980s is the use of power analysis. 
Toft and Shea (1983) recommended the use of power 
analysis to evaluate the probability of nonsignificant 
tests actually detecting a difference. As we entered the 
1990s, we began to see power analysis show up in 
some papers. However, Steidl et al. (1997) credibly 
pointed out the fallacies associated with post hoc 
power analyses. They emphasized clearly that the ap- 
propriate use of power analysis is for study design 
prior to data collection; it should not be used as a tool 
to justify or discuss nonsignificant results. 
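
Used prospectively, a power calculation reduces to a sample-size question. The sketch below assumes a two-sided comparison of two group means, a standardized effect size, and the normal approximation; the function name `n_per_group` and the default values are illustrative choices, not recommendations from the papers cited above.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample comparison."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # critical value for the two-sided test
    z_power = z(power)           # quantile matching the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)

# Sites needed in each of two habitat classes to detect a difference of
# half a standard deviation (a hypothetical design target).
sites_needed = n_per_group(effect_size=0.5)
```

Because the required sample grows with the inverse square of the effect size, halving the difference one hopes to detect roughly quadruples the fieldwork, which is worth knowing before data collection rather than after.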

The 1990s also saw considerable attention given to 
hypothesis testing. As we left the 1980s, the HD 
method was the desired approach, and we encouraged 
our graduate students to establish clear, testable 
hypotheses prior to conducting their study. As this
statistical rigor was being emphasized for inferential
statistics, others were suggesting alternatives. Reck- 
how (1990), among others, encouraged the use of 
Bayesian statistics to evaluate research results rather 
than the more traditional parametric statistics. John- 
son (1999a) developed a case for the Bayesian ap- 
proach and pointed out the common mistakes and as- 
sumptions that are made in traditional hypothesis 
testing. He encourages evaluating data with
point estimates and confidence intervals. Johnson 
(1995) has additionally pointed out that in many cases 
nonparametric statistics are inappropriately applied 
and that in most cases, parametric tests, such as a t- 
test, are robust and perform well. James and McCul- 
loch (1990) provided a thorough overview of the use 
of multivariate statistics in ecology and found the ap- 
plications by researchers wanting in many cases. They 
indicated clearly the situations for which each method is ap-
propriate. These works and others have helped to re-
focus attention on biological, rather than statistical, 
results. 

We also saw during the 1990s further refinement of 
methods used at the landscape level, which is closely 
tied to technology development. At the landscape 
scale, a major new approach is the development of 
spatially explicit models (e.g., Pulliam et al. 1992, 
Rickers et al. 1995) that tie population dynamics to 
specific locations or patches on the ground. Turner et 
al. (1995) have pointed out how such models can help 
to meet many of the needs of natural resource man- 
agers who have responsibility to manage large tracts 
of land. Our ability to generate a plethora of land- 
scape metrics increased during this decade (McGarigal 
and Marks 1995), but our ability to understand the 
nature of these measures still lags behind our capacity 
to create new indices. Distributional properties of 
these measures are generally not well known, are usu- 
ally nonlinear, and are sometimes multimodal as land- 
scapes change (Trani and Giles 1999). Hence, as we 
use such measures as predictor variables in models, 
it may not be clear just what the resulting model 
represents. 

Habitat preference assessment has con-
tinued to be an important component of many studies, 
and a relatively new approach, compositional analy- 
sis, has come to be the method of choice to evaluate 
resource selection (Aebischer et al. 1993). This ap-
proach addresses perceived problems with approaches 
such as Neu et al. (1974) and fits well within the 
framework of study types outlined by Thomas and 
Taylor (1990). Additionally, a wide variety of ap- 
proaches was synthesized and presented by Manly et 
al. (1993).

We continued to develop models from a diversity 
of habitat analyses during this period, but the ten- 
dency to develop, rather than to test, models contin- 
ued (Morrison et al. 1992). During this period, how- 
ever, assessing accuracy of a variety of models was 
undertaken (e.g., Timothy and Stauffer 1991; Flather 
and King 1992; Block et al. 1994; Heppell et al. 
1994; Adamus 1995; Nadeau et al. 1995). Generally, 
these and other studies indicated we can achieve 
moderate to good accuracy in predicting presence/ 
absence, but abundance is more difficult to precisely 
predict. 

Methods to model habitat quality in a management 
context continued. Particularly notable was the devel- 
opment of the hydrogeomorphic method (HGM) 
(Smith et al. 1995), which is designed to evaluate wet- 
land quality and complements HEP. Roloff and Ker- 
nohan (1999) summarized seventeen studies that 
tested fifty-eight HSI models; upon finding that the 
majority of the tests were deficient, they provided al- 
ternative guidelines for effective and valid model tests. 
Van Horne and Wiens (1991) reviewed forest bird HSI 
models and assessed ways to develop multispecies HSI 
models. They provided suggestions for incorporating 
large-area variables into these models and emphasized 
the need for testing model components and the final 
product. 

Even as we developed more management models, 
the apparent gulf between researchers and managers 
had not narrowed appreciably (Turner et al. 1995) but 
more efforts were being made to develop models that 
were based on data typically available to resource 
managers (e.g., Dettmers and Bart 1999; Penhollow 
and Stauffer 2000). Starfield (1997) very nicely dis- 
cussed the difference between models designed to dis- 
cover "truth" and those to be used as a problem- 
solving tool. He developed a strong argument for bet- 
ter communication between decision-makers and 
scientists. 


Where Are We Going?


Clearly, we've come a long way since the 1950s with 
our approaches to analyzing and modeling wildlife- 
habitat relationships. The development of technolog- 
ical aids has greatly influenced how we have been 
able to analyze data at multiple scales of extent and 
grain. As computer power continues to increase 
along with additional development of analytical 
techniques, we will be able to develop increasingly 
complex models. I anticipate that we will continue to 
see greater integration of different scales (extent and 
grain) of habitat information with a greater reliance 
on GIS and remotely sensed data. Some promising 
new approaches include fuzzy logic and neural net- 
works (Lusk et al., Chapter 28), spatially explicit 
models (Raphael and Holthausen, Chapter 62), and 
CART regression (Clark et al. 1999); how well these 
will work remains to be seen. Increasing computing 
power should also allow continued use of statistical 
methods such as permutation tests and Bayesian ap- 
proaches as appropriate. 

The pattern of the past seems to be that as a new 
method or technique appears, it generates a great deal 
of excitement, researchers quickly apply it to their 
work, and, for a period of time, it is discussed fre- 
quently in the literature and at professional meetings. 
It is only later, as we come to better understand the 
limitations of and assumptions behind the new 
method or technique, that it is applied less frequently, 
but perhaps more appropriately. If the method proves 
to be truly useful for some situations, even as it falls 
from grace as the method du jour, it will continue to 
be used in those situations where it works well. I sus- 
pect that, as new approaches continue to be devel- 
oped, we will see this pattern repeated. 

As we try to bridge the gap between the “real
world” needs of managers and the more esoteric do-
main of researchers at universities and other institu- 
tions, I am optimistic that progress will be made. As 
our natural resources continue to be impacted, it is 
critical that we have the means at our disposal for 
wise stewardship. One aspect of this will include the 
use of tools such as models that clearly allow the eval- 
uation of potential impacts and management alterna- 
tives. Resource managers will continue to need accurate
models that can be readily applied over large
areas to help them address management challenges. I 
believe that much of the past distrust managers had 
for researchers resulted from the fact that many of the 
models developed were for small areas and required 
inputs of habitat data not typically present in manage- 
ment databases. As researchers continue to develop 
models at landscape scales that are more appropriate 
to management needs, I am optimistic that the gap be- 
tween managers and researchers will narrow. We must 
communicate better so that researchers can develop 
the methods managers need to meet their immediate 
needs; in an adaptive management context, doing so 
may prove fruitful (e.g., Conroy and Moore, Chapter 
16), but it will require effort on both sides of the 
fence. 

A concern for the present and future is the informa- 
tion glut. New journals continue to be published, 
many now in electronic format, and it has become dif- 
ficult to keep up with the relevant work within any 
subject area. The result might be a duplication of ef- 
fort and repetition of research rather than research 
built upon what has come before. For example, two 
recent papers (Hepinstall and Sader 1997; Tucker et 
al. 1997) presented the use of Bayesian statistics to 
model suitability of landscapes for birds. This would 
appear to be an innovative approach to such a task. 
However, a quarter of a century ago, Williams et al. 
(1978) presented the development of PATREC mod- 
els, based on Bayes’ theorem, that can be used to eval- 
uate habitat quality. Reference to the work on 
PATREC models in the 1970s does not appear in these 
recent papers. Here is a case where the approach has 
been reinvented. (I do not intend to detract from the 
solid quality and useful results of these papers; rather, 
I use this to point out how difficult it is to keep up 
with everything that has been done.) Symposia and 
works such as this volume contribute greatly to the 
sharing of information and to allowing all to stay up 
with current work. Such meetings should be encour- 
aged in the future. 

As our computing power increases, our models 
can (but shouldn't necessarily) become more com- 
plex. A model may be considered to be a hypothesis 
based on the data used for its development. It is im- 
portant that models (i.e., hypotheses) continue to be
tested and evaluated. I fear at times that model com-
plexity is equated to model quality and that we may 
have exceedingly high expectations for our model 
outputs. The data used to develop models (both re- 
sponse variables and predictors) are measured with 
error (Maurer, Chapter 9), and it is not likely that we 
will ever create models that predict with great preci- 
sion (e.g., number of animals per unit area within 
0.1 individuals per hectare) with any degree of relia- 
bility. A variety of factors influence the detectability 
of species in field studies (Ralph and Scott 1981; 
Thompson et al. 1998), yet most modeling efforts 
have treated the presence/absence or abun-
dance of the species being modeled as a known en- 
tity. Doing so is likely to hinder model accuracy. We 
must recognize that we are often predicting the prob- 
ability of detecting a species, given a set of habitat 
conditions, and not necessarily its true abundance or 
presence. Recently, species detectability has received 
consideration in modeling efforts (Conroy and Noon 
1996, Boone and Krohn 1999, Schaefer and Krohn, 
Chapter 36); improving the accuracy of our models 
should prove to be a fruitful area of research in the 
future. 
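
The distinction between presence and detection can be made concrete with a short sketch. It assumes that a site is occupied with some probability, that each visit to an occupied site detects the species independently with a constant probability, and that there are no false detections; these assumptions and the function name `prob_detected` are mine, introduced only for illustration.

```python
def prob_detected(occupancy, detection, visits):
    """Probability that a site yields at least one detection across all visits."""
    # Chance of at least one detection at a site that is truly occupied.
    p_star = 1 - (1 - detection) ** visits
    return occupancy * p_star

# Hypothetical values: 60% of sites occupied, 30% detection per visit, 3 visits.
apparent = prob_detected(occupancy=0.6, detection=0.3, visits=3)
```

Under these values only about 39 percent of sites would record the species although 60 percent are occupied, so a model fitted to the raw records describes detection as much as it describes habitat.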

We will continue to improve our ability to more 
precisely and accurately measure habitat features at 
large and small scales and to estimate species abun- 
dance or presence with greater accuracy. I believe, 
however, our ability to develop models that ade- 
quately predict species abundance or density will be 
limited because of the inherent noise in the systems 
within which we work. In many (or most) cases, we 
may well be satisfied with being able to accurately pre- 
dict presence or absence, or perhaps general categories 
such as none, low, medium, and high, with some de- 
gree of consistency. For example, Jorgensen and De- 
marais (1999) found it difficult to develop models that 
predicted exact species richness of small mammals, 
but they were able to reasonably predict habitats with 
high and low species richness, which may be adequate 
to meet management needs. The simplest model that 
meets a stated goal is likely to be the best one. Al- 
though such models may not provide the “truth” that 
assures us another publication in a rigorous journal, 
they may provide the problem-solving capability 
needed by a manager (Starfield 1997). 

The more-sophisticated statistics and faster com-
puters now readily available allow us to make mis- 
takes more quickly than ever before. A firm founda- 
tion in the basics of experimental and survey design is 
necessary; we should always be aware of what the 
sampling/experimental unit is for the problem at hand 
(Hurlbert 1984). It is critical that all researchers have 
a solid understanding of their objectives and of the 
tests or procedures appropriate for their data. Given 
the ease with which most of the current analytical pro- 
grams can be applied, it is quite simple to conduct 
nonsensical analyses and, unless one is paying close 
attention, to believe in the results. No amount of sta- 
tistical wizardry can compensate for a poor study de- 
sign or inadequate data. Thus, it is incumbent upon 
those who have responsibility for training the upcom- 
ing generations of researchers and managers to ensure 
they are well grounded in these basics. I believe it is 
particularly important that we teach our students to 
think conceptually. Serious reflection on the what, 
why, and how of data analyses, before touching a 
computer, should be required of all students (and es- 
tablished researchers). Doing so should help provide 
students with a sense of how their work adds to previ- 
ous research and help ensure appropriate application 
of methods. 

I am optimistic about the future of modeling the re- 
lationships between animals and their habitats. We 
have learned much over the years and have new tech- 
nologies and methods available that allow us to ad- 
dress needs of single species and communities at 
multiple scales. To the extent that we can learn from 
our mistakes, build upon our successes, and temper 
our expectations with pragmatism, we should be able 
to continue to develop the ability to analyze and 
model animal responses to habitat in contexts valu- 
able to both researchers and managers. 


Acknowledgments 


I appreciate the assistance of C. Darby with the review 
of JWM papers. This manuscript was improved by the 
review and suggestions of R. Oderwald, M. Reynolds, 
J. Berkson, D. Whitaker, T. Fearer, K. Mattson, J. M. 
Scott, and an anonymous reviewer; all errors remain 
my own. 



CHAPTER 


4 


Approaches to Habitat Modeling: 
The Tensions between Pattern and Process 
and between Specificity and Generality 


Beatrice Van Horne 


In ecology, as in other sciences, theoretical models
summarize important processes and relationships 
without claiming to predict occurrences in specific sit- 
uations. Applied habitat models, on the other hand, 
make quantitative predictions that can be evaluated 
with empirical data. Because models make verifiable 
predictions, applied habitat modeling is an activity 
most ecologists approach with some trepidation. Most 
agree that quantitative models provide an essential op- 
portunity to make ideas about habitat relationships of 
wildlife species explicit, quantifiable, and testable. 
Such models foster communication among researchers 
and managers by minimizing misunderstandings asso- 
ciated with more qualitative descriptions of wildlife- 
habitat relationships. Of course, because any model is 
a simplification of a complex biological system, it can- 
not be perfectly predictive. Hence, a model may be 
thought of as a hypothesis that we know is false for 
any given biological system, but we may choose to 
keep because it is useful (Box 1976). The fear ecolo- 
gists feel in constructing models results from knowing 
that the model is open to criticism because it is ulti- 
mately false. Any modeling implementation will in- 
clude unrealistic biological assumptions, not meet sta- 
tistical assumptions, omit causal relationships, and/or 
fail to meet modeling objectives. We handle our mod- 
els with skepticism because we are aware of these 
foibles and know that the modeling approach we have
used is likely to be heavily criticized and replaced with
new approaches in the future. Unfortunately, misun- 
derstanding the power of a given approach for making 
certain types of predictions or answering some types 
of questions may lead to misuse of models or mis- 
placed criticism. Sometimes such criticism leads to
the premature abandonment of promising approaches. 
In this brief discussion of habitat models, my objec- 
tive is to clarify the strengths, weaknesses, and appro- 
priate uses of different types of models. It is my hope 
that, if these are understood more clearly, we may be 
less likely to be blown by the winds of fashion in 
choosing modeling approaches and more likely to 
make real progress in constructing appropriate and 
useful models. Block and Brennan (1993) have dis- 
cussed historical approaches to studying and modeling 
bird-habitat relationships, and Rosenzweig (1989) has 
summarized experimental approaches to developing 
ecological models of habitat selection by small mam- 
mals. In my discussion, I will assume Block and Bren- 
nan's (1993) definition of habitat as the subset of 
physical environmental factors that a species requires 
for its survival and reproduction. It follows from this 
definition that individuals in higher-quality habitat 
will have greater survival and reproduction, and hence 
fitness, than individuals in poorer-quality habitat. 
Modeling techniques are closely tied to modeling 
objectives. Just as there is no perfect personal automobile,
there is no single “best” modeling approach.
When choosing a personal automobile, customers incorporate
tradeoffs between size and gas mileage,
clearance and stability, or smoothness of the ride and 
precision of handling according to their larger objec- 
tives such as minimizing cost, maximizing off-road ca- 
pabilities, or impressing the neighbors. Similarly, any 
modeling approach involves a trade-off. Adopting an 
appropriate modeling strategy involves matching man- 
agement objectives with model capabilities. Re- 
searchers need to assess the generality they wish to 
achieve with their modeling effort and balance that 
against the power to use the model to make specific 
predictions (Levins 1966). Many modeling efforts rep- 
resent an attempt to quantify imperfect knowledge. 
Models may be useful in determining the relative value 
of management alternatives but not be able to provide 
accurate predictions of the effects of these alternatives. 


Generality Versus Specificity in Modeling 


Models that make specific predictions are likely to be 
limited in their application because of variability in 
both time (Best and Stauffer 1986) and space. Because 
even the most complete model cannot include all bio- 
logically important causal relationships for a given lo- 
cation, one cannot expect accurate predictions over 
time. Such models may describe occurrences in the 
past accurately but are not useful for forecasting 
(Morrison et al. 1992). A model that works well for a 
given location, such as one that predicts the number of 
burrowing owls (Athene cunicularia) based on the 
number of prairie dog burrows available, may not be 
useful in another location, such as an area where 
prairie dogs are absent and burrowing owls nest in 
sites other than prairie dog burrows (Haug et al. 
1993). 

Usually, general models are simpler than models 
that aim to make specific predictions. Using fewer 
terms enhances the clarity of such models and in- 
creases their applicability to a broad range of systems. 
General models may either describe broad patterns 
(general empirical models) or be more theoretical than 
specific models (Fig. 4.1). General empirical models 
lose the ability to make specific predictions because 
they are less likely to be directly related to processes 
than are either the specific models (Bissonette 1997) or
the theoretical models. Researchers construct theoretical
models to understand the important relationships
that govern population numbers. For instance, much
recent work has been framed in terms of “bottom-up”
or “top-down” population regulation (see Power
1992). A theoretical model would describe when and
how each type of regulation is primary across a wide
range of populations of a given species or across a
broad range of species. This could have important implications
for managing a target species. A specific
model would document the nature of population regulation
for an intensively studied population but would
have little explanatory power across the species range
or among species.

Figure 4.1. In reality, the influence of habitat on a species is
determined by the pattern-process relationship. Theoretical,
specific, and general empirical models incorporate varying
amounts of pattern and process.


Model Validation and Boundary Conditions 


Because general theoretical or empirical models do not 
make precise predictions, and because specific models 
make predictions only for a target population at a cer- 
tain time (or under certain conditions), habitat models 
cannot be tested with new data sets in the same way 
that a hypothesis can be tested. Thus, the emphasis on 
external validation of models is often misguided. One 
can, however, test the importance of processes thought 
to be important in driving observed habitat relation- 
ships and test for the boundary conditions under 
which these processes may become relatively unimportant.
Boundary conditions describe the range of a
given habitat variable or variables within which the 
model is applicable, or the model “domain.” For 
example, at the northern end of their range in winter, 
belted kingfishers (Ceryle alcyon) feeding on freshwa- 
ter minnows may be limited by the availability of ice- 
free rivers (Kelly and Van Horne 1997), whereas far- 
ther south other factors may determine habitat 
occupancy. Hence, a model describing habitat occu- 
pancy in terms of ice-free rivers would apply only 
when and where lotic waters were subject to freezing. 
Similarly, availability of old-growth forest may deter- 
mine numbers of northern spotted owls (Strix occi- 
dentalis caurina) (Forsman et al. 1996a,b), whereas 
availability of a mixture of forest types influences 
Mexican spotted owls (Strix occidentalis lucida) 
(Ganey and Dick 1995; Gutiérrez et al. 1995). Rather 
than attempting to construct a single model that can 
either be applied across the entire range of a species or 
restricting application of the model to the place in 
which data were collected, it would be more useful to 
broaden sampling or experimental sites to identify 
boundary conditions in forest cover, temperature, 
moisture, and similar factors within which the model 
is applicable. The use of gradients in sampling design 
has been more common among botanists than zoolo- 
gists but is essential for identifying boundary condi- 
tions and thresholds. Austin (Chapter 5) discusses the 
design of gradient-based sampling for modeling rela- 
tionships between fauna and vegetation. 

Such boundary conditions for continuously distrib- 
uted variables can only be identified when gradients 
are investigated as part of the study design. Animal 
ecologists do not usually test boundary conditions be- 
cause most work is done within or among discrete 
habitat types. They may array experimental blocks 
along a gradient of interest, but such sampling may be 
too discontinuous to identify boundary conditions. 
This contrasts with the approach of plant ecologists 
for whom sampling along gradients (and hence, deter- 
mining boundary conditions) is more common (Austin 
1999b). These differences in study design may in part 
result from the different statistical and/or experimen- 
tal approaches that are used by these two groups. An- 
imal ecologists often begin their investigations with 
ANOVA approaches and block experimental designs, 
whereas plant ecologists often begin with ordination. 

It appears that models cannot be universally vali- 
dated because they are too general to make precise 
and testable predictions, apply only within tight and 
often unknown boundary conditions, or have failed to
incorporate process and therefore rely on correlations 
that may be spurious. Adopting an appropriate mod- 
eling strategy involves matching management objec- 
tives with a realistic assessment of model capabilities. 
In particular, researchers need to assess the generality 
they wish to achieve with their modeling effort and 
balance that against the power to use the model to 
make specific predictions. The use of alternative mod- 
els in an adaptive framework (Conroy and Moore, 
Chapter 16) can greatly enhance the predictive power 
of models if carried out over a long-term period. 

Too often, modeling approaches are chosen on the 
basis of currently accepted “fashions” rather than a 
realistic assessment of objectives and capabilities. 
When a new form of habitat modeling becomes more 
popular than an existing technique, it is useful to con- 
sider whether this replacement represents real 
progress. The answer to this query is not simple, be- 
cause there is no good standard for measuring 
“progress.” Changes in approaches to modeling take 
place for several reasons. Often, they are prompted by 
criticism of existing approaches. Usually these criti- 
cisms are targeted at the assumptions of the models 
and therefore driven by a misunderstanding of the 
necessary simplification. Because the factors involved 
in evaluating models are complex, it is sometimes dif- 
ficult to discern valid and useful criticism combined 
with the development of useful new approaches from 
criticism based on misunderstanding of the assump- 
tions and objectives of existing models. Sometimes 
changes are made possible by the development of new 
statistical techniques and/or advances in computer 
hardware and software. 


Matching Goals with Techniques 


The goals of habitat modeling fall into two broad cat- 
egories. The first is a general inventory to establish 
links between habitat conditions or distribution and 
biodiversity (Cablk et al. Chapter 37; Scott et al. 
1993). From this assessment, species that are rare, 
have restricted ranges, or are unexpectedly absent may 
be identified. Such an inventory may serve as the 
basis for management decisions based on the notion 
of maximizing biodiversity. The second goal is to 
establish links between habitat condition and the viability
of one or several species. Assessing viability requires
knowledge of the processes influencing changes
in density, which may allow managers or researchers 
to predict the effects of ongoing or expected habitat 
change on the target species. 


Assessing Biodiversity: Predicting 
Species Occurrence in a Landscape 


Organized inventories of species in which presence/ 
absence is associated with habitat categories may be 
used to identify areas of high biodiversity to develop 
protection schemes that maximize the total number of 
protected species or to identify a loss of species associ- 
ated with anthropogenic habitat changes, including 
those resulting from fragmentation. Multispecies in- 
ventory is an ambitious undertaking, and a relatively 
high level of error is expected and tolerated so that an 
expansive view can be developed (Noss et al. 1997). 
Lack of attention to processes driving the observed 
patterns of habitat occupancy makes these approaches
easy both to criticize and to misuse, but such ap- 
proaches are an essential first step in identifying con- 
servation concerns. 

Methods for systematically organizing information 
about multiple species and the habitats they occupy 
fall broadly under the WHR (wildlife habitat relation- 
ships) rubric (Nelson and Salwasser 1982; Patton 
1978; Thomas 1979; Verner and Boss 1980; Hoover 
and Wills 1984). Such methods may help to organize 
information during the inventory phase but do not in- 
clude or predict information about densities, popula- 
tion trend, or habitat patch size and configuration. 

The Gap Analysis Program (GAP) of the USGS Biological
Resources Division (Scott et al. 1991a, 1991b,
1993) is a systematic means of inventorying vegeta- 
tion using GIS (geographic information system) map- 
ping procedures, predicting the occurrence of wildlife 
species in different habitats, and using this informa- 
tion to identify priority areas for protection. Used ap- 
propriately, gap analysis can encourage consideration 
of biodiversity patterns at a broad scale (among habi- 
tats) rather than focusing on local, within-habitat bio- 
diversity. Because it is conducted over large areas with 
coarse-grained resolution, gap analysis is less useful 
for evaluating management techniques (e.g., burning 
or logging), for predicting the effects of landscape pat- 
tern (e.g., importance of ecotones or patch sizes), or 
for understanding the processes governing the pres- 
ence and absence of species. 

One danger of biodiversity models is that the out- 
put of the models can allow managers to follow pre- 
scriptions thoughtlessly. For instance, the model may 
suggest that protecting a series of habitat types in cer- 
tain proportions will maximize biodiversity. Such a 
prescription, however, ignores individual habitat re- 
quirements of species (e.g., a minimum area or config- 
uration to protect a sustainable population), species 
interrelationships (e.g., loss of a keystone predator 
[Paine 1966] may have ramifications that loss of an 
herbivore would not), and species valuations. Species 
valuations may be based on species’ popular 
“charisma,” trophic level, whether or not a species is 
native and/or endemic to a particular area, or rarity 
outside the focal area of study. Some species may have 
value because they attract popular attention (e.g., rap- 
tors, grizzly bears, apes, porpoises, penguins). Species 
at a higher trophic level (e.g., wolves, spotted owls) 
may be more valued for the role they play in the com- 
munity or because their habitat requirements may 
provide an “umbrella” for prey species. Defining na- 
tive species may present difficulties where species’ 
ranges have expanded as a result of anthropogenic 
change. For instance, riparian-associated species, such 
as indigo buntings (Passerina cyanea) and yellow- 
bellied sapsuckers (Sphyrapicus varius), have ex- 
panded their ranges westward in North America along 
the Platte River because of its now-artificial flow 
regime, enhancing the biodiversity of riparian areas. 
Rarity outside the focal area is tied to the notion of re- 
gional- or gamma-diversity and can present difficulties 
in species inventory approaches (Samson and Knopf 
1982). Species contributing to higher diversity along 
the Platte River are more likely to be represented else- 
where in North America than are members of the rel- 
atively depauperate species community of the sur- 
rounding prairie (Knopf 1985, 1986). 

How fragmentation effects are incorporated into
biodiversity models is another area of difficulty. Be- 
cause individual species differ in home-range size, 
minimum population sizes, effects of habitat edges, 
and other factors, the effects of a given level or form 
of fragmentation of habitat will vary among species 
(Bierregaard et al. 1992). Such variation is incorpo- 
rated implicitly into species/area-based models, such 
as those based on the Theory of Island Biogeography 
(MacArthur and Wilson 1967), but such a level of 
generalization does not allow for predictions about ef- 
fects on individual species. Use of species/area curves 
to model biodiversity may mask important relation- 
ships between the invasion of exotics and species di- 
versity, as when invasion rates increase with biodiver- 
sity (Stohlgren et al. 1999). 
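
The limitation of species/area-based reasoning can be made concrete with a small calculation. The Python sketch below assumes the familiar power-law species-area relationship, S = c * A**z, with an illustrative exponent of z = 0.25; that value is an assumption chosen for the example and does not come from any study cited here. The function name is likewise hypothetical.

```python
def fraction_of_species_retained(area_fraction, z=0.25):
    """Fraction of species expected to persist under S = c * A**z when
    habitat shrinks to `area_fraction` of its original extent.

    The constant c cancels out of the ratio, so only z matters.
    z = 0.25 is an assumed, illustrative exponent.
    """
    if not 0.0 < area_fraction <= 1.0:
        raise ValueError("area_fraction must be in (0, 1]")
    return area_fraction ** z


# Report the expected share of species for two levels of habitat loss.
for remaining in (0.5, 0.1):
    retained = fraction_of_species_retained(remaining)
    print(f"{remaining:.0%} of habitat left: about {retained:.0%} of species expected")
```

The result is a single proportion for the whole assemblage. Nothing in the calculation indicates which species are lost, which is why such models cannot address effects on individual species.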


Assessing Species Viability: 
Predicting the Effects of Habitat Change 


A species may be highly valued or of special concern 
either in its own right or because it serves as an indi- 
cator or umbrella species. In such cases, focal studies 
may be used to predict the effects of anthropogenic 
change or management actions on the viability of the 
species. 


Species-habitat Correlations 


Analyses of the relationship between the co-occur- 
rence of individuals of a species and habitat variables 
have often been used to identify high-quality or essen- 
tial habitat for a species. Conceptually, this approach 
follows from the n-dimensional niche described by 
Hutchinson (1957), in which the niche of an organism 
was described by axes, each representing an environ- 
mental variable. It follows that, if we correlate the oc- 
currence of an organism with a series of environmen- 
tal variables in a given study area, we may be able to 
describe the niche of the organism and, hence, predict 
its pattern of habitat occupancy in other areas. The 
process of gathering data for such analyses is straight- 
forward. Individuals are surveyed and their presence 
or absence is associated with habitat variables col- 
lected nearby. The underlying philosophy of such an 
approach is that a higher level of objectivity is 
achieved by collecting the data and massaging them to 
life so that they may speak to us about the organism’s 
view of its environment, rather than imposing our 
own intuitions or natural history-based knowledge to 
structure the analysis. 

Problems with such a correlative approach are
found in at least four areas. The first of these is in re- 
ducing the number of variables. All analytic ap- 
proaches using these sorts of data must deal with the 
statistical problem that one can measure many more 
variables that may influence species’ occurrence than 
there are observations or samples in most of these 
data sets. During the 1970s, the use of multivariate 
approaches based on either principal components or 
discriminant analysis was common (Capen 1981). Be- 
cause such multivariate approaches use the correlation 
structure within the habitat variables to produce a few 
synthetic axes that can later be interpreted, it was 
thought that these provided an objective means of dis- 
cerning habitat relationships. This idea was bolstered 
by the view that organisms viewed their environment 
in a synthetic or “gestalt” fashion (James 1971). Mul- 
tivariate approaches faltered, however, when it was 
pointed out that assumptions of these analyses were 
seldom met, so that the significance values assigned to 
habitat relationships were questionable (James and 
McCullough 1990). The problem of reducing the 
number of variables still exists, however. Although we 
can all agree that it is important to take the point of 
view of the organism in choosing the variables to 
measure (Morrison et al. 1992, 1998), this is exceed- 
ingly difficult. Consequently, the variables chosen for 
correlation are generally those that we can measure 
easily. 

The second problem in using correlations arises 
from the shape of the variable-response relationship. 
Hutchinson’s concept of the niche (Hutchinson 1957), 
and explorations of the nature of niche breadth and 
overlap that followed, envisioned a normal, Gaussian 
response by the organism to a gradient in the variable 
or resource. Where such a curve exists, we may record 
an increasing, nonexistent, or decreasing response de- 
pending on the range of the variable that is measured 
(Fig. 4.2). Yet most correlative approaches assume 
that if no linear response is detected, a given habitat 
variable is not an important determinant of species oc- 
cupancy (Austin 1999a). Statistical significance (still 
usually at P < 0.05) has become the measure of “sig- 
nificance” of habitat variables. Logistic regression al- 
lows for one form of nonlinearity in response (Hassler 
et al. 1986), but there are many nonlinear forms
possible in ecological data. Hypothesizing the best
model to fit the data a priori is often difficult, and trial
and error exercises of model fitting weaken final
inferences.

Figure 4.2. Limited sampling along a gradient in a habitat variable
may produce a positive (a), nonexistent (b), or negative (c)
correlation with a species response variable such as density if
the species response is a variant of a normal curve. It may
then be useful to describe the boundary conditions for wildlife-habitat
correlations. These boundary conditions may be set by
the habitat variable in question or by other habitat variables.
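
The dependence of an observed correlation on the sampled portion of a gradient can be checked numerically. The following Python sketch assumes a Gaussian density response with a hypothetical optimum (50), tolerance (10), and peak density (100) on an arbitrary habitat gradient; the three sampling windows are likewise invented for illustration.

```python
import math


def gaussian_response(x, optimum=50.0, tolerance=10.0, peak=100.0):
    """Species density at gradient position x, assuming a Gaussian response."""
    return peak * math.exp(-((x - optimum) ** 2) / (2.0 * tolerance ** 2))


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_over_range(low, high, n_sites=21):
    """Correlation between the habitat variable and density when only
    the window low..high of the gradient is sampled (evenly spaced sites)."""
    xs = [low + (high - low) * i / (n_sites - 1) for i in range(n_sites)]
    ys = [gaussian_response(x) for x in xs]
    return pearson_r(xs, ys)


r_rising = correlation_over_range(20, 45)   # window below the optimum
r_middle = correlation_over_range(40, 60)   # window centered on the optimum
r_falling = correlation_over_range(55, 80)  # window above the optimum
print(r_rising, r_middle, r_falling)
```

The same underlying response yields a strongly positive, an essentially zero, and a strongly negative linear correlation, depending only on where along the gradient the sites happen to fall.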

The third problem, related to the second, is that the 
nature of the organism response will vary with scale. 
Wiens (1989a) pointed out that the nature of per- 
ceived species-habitat relationships changed with the 
scale (extent and grain) at which they were measured. 
As we scale up or down, the domain investigated, the 
observation set, and the language used will often 
change (Bissonette 1997a). In other words, types of 
variables and the scale the measurements encompass 
change. High density of a species within a 1-hectare 
area may be qualitatively different and have a differ- 
ent effect on population change than high density of 
the same species over a 100-hectare area. Because 
habitat selection may itself take place at multiple 
scales (Bissonette et al. 1997a; Hildén 1965; Hutto 
1985; Wiens et al. 1987), it may be necessary to incor- 
porate measurements of variables at these multiple 
scales into predictive models of habitat use. It is there- 
fore important to carefully match the question to the 
scale and to ask questions at different scales of extent 
and levels of resolution. 

The fourth problem is the classic conundrum that 
correlation does not signify causation or process. Be- 
cause causal factors may change without concomitant 
changes in the values of the habitat variables meas- 
ured, correlations may break down in space and time. 
A special case of this results from a source-sink habitat
structure (Pulliam 1988) that can cause densities to
build up in low-quality habitats (Van Horne 1983) as 
a result of social structuring (Fretwell 1972). Similarly, 
loss of habitat may cause increased density in remain- 
ing habitat. There may be little effect on population 
processes where habitat is not limiting, as in some sit- 
uations with migratory birds (Goss-Custard et al. 
1994). In such situations, correlations of habitat vari- 
ables with species density are likely to overestimate 
the effect of habitat loss because they don’t incorpo- 
rate long-term increases in density in the remaining 
habitat, nor do they include information about the ef- 
fect of such increased density on population processes 
(Goss-Custard et al. 1994). 

Because ecological systems are highly variable, an 
experimental approach is necessary to define the 
cause-effect relationship. I caution, however, against 
viewing experimental approaches as the solution to all 
dilemmas in understanding relationships between 
wildlife and their habitats. Experiments are limited in 
scope and can only test for effects of one or a few vari- 
ables in isolation. In addition, not all experiments elu- 
cidate cause-and-effect processes. For example, an ex- 
periment determining habitat association will not 
necessarily expose the cause of that habitat associa- 
tion. Often, the most profitable procedure is to mix 
experimental and descriptive approaches. Descriptive 
approaches might define patterns of habitat occu- 
pancy for a squirrel across its range, while experi- 
ments might be used to distinguish the roles of habitat 
structure and mast availability in determining habitat 
use or patterns of survival and reproduction at a par- 
ticular site. 


Knowledge-based Models 


Apparent instability of habitat correlations has re- 
sulted in part because such relationships mask the 
processes that drive population change and, in part,
because the relative importance of different processes 
may change with time, space, and scale. Frustration 
with such instability led to the development of the 
HEP (habitat evaluation procedures; USFWS 1980, 
1981a) and HSI (habitat suitability index; Scham- 
berger et al. 1982) approaches. HEP is a systematic 
means of assessing habitat conditions and manage- 
ment alternatives using habitat units of assigned qual- 
ities, while HSI is a means of assessing habitat quality. 
Although HSI models may make use of correlations, 
they differ fundamentally from the correlative ap- 
proaches described above in that the key causal rela- 
tionships are not expected to emerge from the analy- 
sis. Rather, they rely on knowledge of natural history 
and processes that is derived from researchers familiar 
with the target species. Development of an HSI model 
often involves assembling such “experts” and forcing
them to draw relationships between habitat quality 
and key life-history attributes (even in the absence of 
P < 0.05 data). Thus, a synthetic variable uses logical
relationships (either/or/and) to derive a single measure 
of habitat suitability. Used properly, such models rep- 
resent a valuable approach to structuring our knowl- 
edge of animal-habitat relationships and identifying 
areas in which further research is needed (Van Horne 
and Wiens 1991). There is the opportunity to include 
nonlinear relationships in these models where we have 
sufficient knowledge of the shape of these relation- 
ships. As currently constructed, however, we cannot 
expect these simplistic models to make precise predic- 
tions about the nature of habitat occupancy. Hence, it 
is difficult to “validate” such models or even to understand
what might constitute validation. Further, they
don't tell us much about population change or viabil- 
ity, nor do they do a good job of handling fragmenta- 
tion effects or other spatially mediated processes. 
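
The way an HSI-style synthetic variable combines expert-drawn relationships can be sketched in a few lines of Python. Everything specific here is hypothetical: the species, the three habitat variables, the breakpoints of each curve, and the rule for combining them are invented for illustration and are not taken from any published HSI model. The sketch only shows the general logic of "either/or" and "and" relationships, with one nonlinear curve.

```python
def suitability_index(value, breakpoints):
    """Interpolate a 0-1 suitability index from expert-drawn
    (habitat value, index) breakpoints; flat beyond the end points."""
    points = sorted(breakpoints)
    if value <= points[0][0]:
        return points[0][1]
    if value >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= value <= x1:
            return y0 + (y1 - y0) * (value - x0) / (x1 - x0)


# Hypothetical expert curves for an imaginary cavity-nesting bird.
CANOPY_CURVE = [(0, 0.0), (40, 1.0), (80, 1.0), (100, 0.4)]  # nonlinear: best at mid cover
SNAG_CURVE = [(0, 0.0), (5, 1.0)]                            # snags per hectare
CAVITY_CURVE = [(0, 0.0), (3, 1.0)]                          # natural cavities per hectare


def habitat_suitability(canopy_pct, snags_per_ha, cavities_per_ha):
    """Single synthetic measure of habitat suitability on a 0-1 scale."""
    cover = suitability_index(canopy_pct, CANOPY_CURVE)
    # "either/or": nesting needs are met by snags or by cavities,
    # so the better of the two sets the nesting index.
    nesting = max(
        suitability_index(snags_per_ha, SNAG_CURVE),
        suitability_index(cavities_per_ha, CAVITY_CURVE),
    )
    # "and": cover and nesting sites are both required,
    # so the poorer of the two limits overall suitability.
    return min(cover, nesting)


print(habitat_suitability(canopy_pct=60, snags_per_ha=5, cavities_per_ha=0))
print(habitat_suitability(canopy_pct=20, snags_per_ha=5, cavities_per_ha=0))
```

The output is an index, not a predicted density or occupancy rate, which is one reason it is hard to say what validating such a model would mean.
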

Andrewartha and Birch (1984) developed the envirogram
approach, a diagrammatic means of conceptualizing
the factors influencing species abundance. In
their scheme, factors that influence the species directly 
(what I would call key processes) fall into four cate- 
gories they refer to as resources, mates, malentities 
(e.g., weather, competitors), and predators. Outside of 
this centrum, a web of biotic and abiotic factors influ- 
ences the species indirectly; the research objective is to 
determine which of these pathways are important. 
Thus, the indirect effects must be explicitly tracked 
through the factors producing the direct effects, forcing
researchers to conceptualize the cause-effect linkage
between habitat variables and species occurrence.

Burnham and Anderson (1998) describe the use of 
Akaike's Information Criterion (AIC; Akaike 1978a,b; 
1981a,b) to select the most appropriate model from a 
set of possible models or hypotheses determined a pri- 
ori. When information about an animal's natural his- 
tory is used to establish the set of possible models, this 
approach combines knowledge-based and correlative 
approaches while avoiding the artificial standards 
(e.g., P < 0.05) for selecting an appropriate model. 
This approach may be particularly useful for identify- 
ing variables representing real effects on the species 
prior to experimental manipulation. 
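
As a rough illustration of ranking an a priori model set, the Python sketch below computes AIC = -2 ln(L) + 2K for each candidate and converts the differences to Akaike weights. The candidate models, their maximized log-likelihoods, and their parameter counts are hypothetical numbers invented for the example; in practice they would come from fitting each model to the same data set.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: -2 ln(L) + 2K."""
    return -2.0 * log_likelihood + 2.0 * n_params


def akaike_weights(aic_by_model):
    """Relative support for each model within the candidate set."""
    best = min(aic_by_model.values())
    relative = {name: math.exp(-0.5 * (value - best))
                for name, value in aic_by_model.items()}
    total = sum(relative.values())
    return {name: r / total for name, r in relative.items()}


# Hypothetical candidates: name -> (maximized log-likelihood, number of parameters).
CANDIDATES = {
    "shrub cover": (-120.4, 2),
    "shrub cover + burrow density": (-112.9, 3),
    "all measured variables": (-111.8, 9),
}

scores = {name: aic(ll, k) for name, (ll, k) in CANDIDATES.items()}
weights = akaike_weights(scores)
best_model = min(scores, key=scores.get)

for name in sorted(scores, key=scores.get):
    print(f"{name}: AIC = {scores[name]:.1f}, weight = {weights[name]:.3f}")
```

With these made-up numbers the model with every measured variable fits slightly better than the two-variable model but is penalized for its additional parameters, so it ranks below it. No significance threshold enters the comparison.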


Population Models 


Viability models are classic population or metapopula- 
tion models linked to habitat or environmental condi- 
tions to predict changes in population size that may 
result from different management policies. These are 
used increasingly to address problems with endan- 
gered or threatened species (Noss et al. 1997). These 
models may be useful to encapsulate our best predic- 
tions of population-habitat associations to allow for 
relative comparisons of management policies. Because 
of the difficulties of understanding causes of popula- 
tion change and modeling stochastic events, they are 
not likely to predict accurately the patterns of popula- 
tion increase, stasis, or decrease. 

Where factors influencing survival and reproduc- 
tion are documented, a life-table approach may pro- 
vide important insight into the effects of habitat 
change. For instance, if the life table is more sensitive 
to small variation in adult survival than to variation in 
reproductive rates or juvenile survival, factors influ- 
encing the quality of wintering habitat may be more 
critical than those influencing breeding habitat. Ob- 
served life histories are an integration of genetic or 
phylogenetic and phenotypic factors. Differences 
among habitats in life histories may be important (Van 
Horne 1983). To an extent, life history can buffer 
habitat effects, as, for example, where the organisms 
can respond to decreased adult survival with increased 
investment in current reproduction. Where this is the 
case, models using demographic constants, such as per 
capita birth or death rates, may be erroneous. Recent 
advances permit demographic parameter estimation
from mark-recapture models and ecological covariates 
(White and Burnham 1999). This approach has been 
used to rank habitats according to their relative effects 
on life-history traits of northern spotted owls 
(Franklin 1997). 
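
The life-table comparison described above can be sketched with a two-stage (juvenile, adult) projection matrix. In the Python sketch below the vital rates are hypothetical values for a long-lived species, and elasticity (the proportional change in growth rate per proportional change in a vital rate) is used as one common way to express "sensitivity"; neither the rates nor the function names come from the studies cited in this chapter.

```python
import math


def growth_rate(fecundity, juvenile_survival, adult_survival):
    """Dominant eigenvalue (lambda) of the matrix
        [[0,                 fecundity     ],
         [juvenile_survival, adult_survival]]
    obtained from its characteristic equation."""
    discriminant = adult_survival ** 2 + 4.0 * fecundity * juvenile_survival
    return (adult_survival + math.sqrt(discriminant)) / 2.0


def elasticity(rates, name, step=1e-6):
    """Proportional change in lambda per proportional change in one
    vital rate, estimated by a central finite difference."""
    raised = dict(rates)
    lowered = dict(rates)
    raised[name] = rates[name] * (1.0 + step)
    lowered[name] = rates[name] * (1.0 - step)
    base = growth_rate(**rates)
    return (growth_rate(**raised) - growth_rate(**lowered)) / (2.0 * step * base)


# Hypothetical vital rates (assumed for illustration only).
VITAL_RATES = {"fecundity": 0.4, "juvenile_survival": 0.3, "adult_survival": 0.85}

elasticities = {name: elasticity(VITAL_RATES, name) for name in VITAL_RATES}
print("lambda =", round(growth_rate(**VITAL_RATES), 4))
for name, value in elasticities.items():
    print(name, round(value, 3))
```

For these assumed rates the growth rate responds far more strongly to adult survival than to fecundity or juvenile survival. Note that the sketch treats the vital rates as constants, which is exactly the assumption questioned above for species whose life histories buffer habitat effects.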


Individual-based Models 


Models that track individual responses to habitat and
project them to population responses are more likely 
to include causal processes than those relying strictly 
on correlations or population-level responses. Such 
models, however, are highly specific and tend to be 
very complex. Goss-Custard et al. (1994), for exam- 
ple, have built individual-based models that predict 
the proportion of Eurasian oystercatchers (Haemato- 
pus ostralegus) emigrating or dying because of habitat 
loss or change. These predictions are then used to 
project population numbers at larger scales. 

Mangel and Clark (1988) have translated dynamic 
programming approaches used in engineering for the 
ecological sciences. Their approach to predicting indi- 
vidual behavior allows the organism to make choices 
based on fixed or stochastic values of risk and reward 
associated with these choices as they influence survival 
and/or reproduction, the components of fitness. 
Choices among habitat types can be modeled in terms 
of causal factors, such as predation probability and 
food availability, associated with each habitat patch. 
Physiological states that are correlated with survival 
and reproduction, such as the probability of maintain- 
ing a threshold level of body fat, can then be used as 
the currency to evaluate the choices. For example, 
Farmer and Wiens (1999) have used such an approach 
to gain insight into migratory movements between 
stopovers and reproductive patterns in pectoral sand- 
pipers (Calidris melanotos). 
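
The logic of such a state-dependent habitat choice can be sketched as a small backward-induction problem. The Python sketch below is a deliberately simplified patch-selection model, not the model of any study cited here: the two patches, their predation risks, food probabilities, energy gains, the daily energy cost, and the rule that fitness equals survival to the end of the period are all assumptions made for illustration.

```python
def solve_patch_choice(patches, horizon=20, max_reserves=10, cost=1):
    """Backward induction over time steps and energy reserves.

    `patches` maps a patch name to (predation_risk, p_find_food, energy_gain).
    Returns (fitness, choice), where fitness[t][x] is the probability of
    surviving to the end of the horizon from reserves x at time t when the
    best patch is used at every step, and choice[t][x] names that patch.
    """
    def clamp(x):
        return max(0, min(max_reserves, x))

    fitness = [[0.0] * (max_reserves + 1) for _ in range(horizon + 1)]
    choice = [[None] * (max_reserves + 1) for _ in range(horizon)]
    for x in range(1, max_reserves + 1):
        fitness[horizon][x] = 1.0  # alive with any reserves at the end

    for t in range(horizon - 1, -1, -1):
        for x in range(1, max_reserves + 1):  # x == 0 is starvation; fitness stays 0
            best_value, best_patch = -1.0, None
            for name, (risk, p_food, gain) in patches.items():
                fed = fitness[t + 1][clamp(x - cost + gain)]
                unfed = fitness[t + 1][clamp(x - cost)]
                value = (1.0 - risk) * (p_food * fed + (1.0 - p_food) * unfed)
                if value > best_value:
                    best_value, best_patch = value, name
            fitness[t][x] = best_value
            choice[t][x] = best_patch
    return fitness, choice


# Hypothetical patches: (predation risk per step, P(find food), energy gain).
PATCHES = {
    "safe, poor": (0.00, 0.4, 2),
    "risky, rich": (0.02, 0.8, 3),
}

fitness, choice = solve_patch_choice(PATCHES)
print("last step, full reserves:", choice[19][10])
print("last step, reserves of 1:", choice[19][1])
```

With these invented values, an animal with full reserves at the last decision uses the safe patch, whereas one close to starvation accepts the predation risk of the richer patch. Habitat use therefore depends on physiological state as well as on the attributes of the habitat.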

Models in biology are often based on new tools 
that become available. Radiotracking (telemetry) rep- 
resents one such tool. Manly et al. (1993) describe re- 
source selection functions that can be used to interpret 
results independent of the number and types of habitat 
identified. Early modeling of habitat preference based 
on comparison of used and unused habitat as deter- 
mined by radiotracking, however, was roundly and 
justifiably criticised (Hobbs and Hanley 1990), because
the investigators’ choice of habitat to be included
in the available but unused portion greatly influences
the outcome of the analysis.
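
The sensitivity of use-versus-availability comparisons to the definition of "available" habitat is easy to demonstrate. The Python sketch below computes a simple selection ratio (proportion of use divided by proportion of availability) for the same set of hypothetical radiotracking locations under two hypothetical study-area boundaries; all counts and areas are invented for illustration.

```python
def selection_ratios(used_locations, available_area):
    """Use/availability ratio for each habitat type.

    Values above 1 suggest selection and values below 1 suggest
    avoidance, relative to the area defined as available.
    """
    total_used = sum(used_locations.values())
    total_area = sum(available_area.values())
    return {
        habitat: (used_locations.get(habitat, 0) / total_used) / (area / total_area)
        for habitat, area in available_area.items()
    }


# Hypothetical radiotracking locations (same animals in both analyses).
USED = {"forest": 60, "meadow": 40}

# Two hypothetical definitions of available habitat, in hectares.
NARROW_BOUNDARY = {"forest": 70, "meadow": 30}
BROAD_BOUNDARY = {"forest": 70, "meadow": 130}  # adds unused meadow to the study area

narrow = selection_ratios(USED, NARROW_BOUNDARY)
broad = selection_ratios(USED, BROAD_BOUNDARY)
print("narrow boundary:", narrow)
print("broad boundary: ", broad)
```

With identical animal locations, forest appears to be avoided under the narrow boundary and selected under the broad one. The conclusion follows from where the investigator drew the boundary rather than from anything the animals did.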


Modeling the Effects of Animals on Their Habitats 


Most approaches discussed thus far have considered 
habitat as a factor that influences animal populations 
and have ignored influences in the opposite direction. 
The concept used is one of a habitat (usually defined 
by vegetation composition or structure) that is taken 
as a given, which the organism may then choose or 
not choose to occupy. Over time and/or as densities in- 
crease, however, animals may modify their biotic and 
abiotic habitats, indirectly influencing their own den- 
sity as well as that of other species by activities such as 
grazing, predation, digging, or scavenging. In extreme 
examples, such as the American beaver’s (Castor 
canadensis) modification of its habitat through dam 
construction, the term “ecosystem engineers” is used 
(Lawton and Jones 1995). Community or ecosystem 
models are more likely to address such effects because 
the former models include feedbacks between species, 
while the latter emphasize dynamic flows of energy 
and nutrients that essentially change the habitat avail- 
able over time. 


The Use of GIS Information in Habitat Models 


Geographic information systems (GIS)-based informa- 
tion has vastly expanded our ability to construct spa- 
tially explicit models of animals in actual or projected 
habitat configurations. Because this tool encourages us 
to use actual habitat maps, it may impart a feeling of
power and omniscience to the habitat
modeler. The emergence of new tools in ecology is
often followed by a period of euphoria and rampant 
collection of new types of data (e.g., genetic tech- 
niques, radiotracking) before methods of analyzing 
and using such data have been critically evaluated. 
GIS opens many new possibilities, but I would like to 
post some cautionary notes before we leap wholesale 
into this new, spatially explicit world. The first of 
these is that problems in variable selection are in- 
creased by orders of magnitude in spatially explicit 
landscapes. The number of variables that become 
available in landscapes is staggering (cf. Gustafson 
1998); these range from geometric variables, such as
patch size, perimeter, shape, density, connectivity, and
fractal dimension, to continuous variables sampled at 
points using trend surfaces, correlograms, semivari- 
ograms, autocorrelation indices, or interpolation such 
as kriging, to synthetic variables such as viscosity. 
With so many variables possible, allowing the impor- 
tant variables to emerge from an “objective” analysis 
(that is, allowing the data to speak for themselves) is 
virtually impossible, and the process of a priori vari- 
able selection supersedes later analytic processes in 
importance. It seems evident that an informed choice 
of variables, based on knowledge of processes that 
drive species occurrence, is essential to the success of 
such analyses. 

A second difficulty with GIS approaches is the scale 
imposed by pixel size and/or by analytic limitations 
that require pixels of a minimum size. Appropriate 
pixel size depends on whether the analysis is individ- 
ual- or population-based. In the former case, it should 
be based on the minimum size of habitat unit that in- 
fluences habitat choice, while in the latter case it 
should be based on the size of a single within-season 
home range or the minimum habitat patch that can 
support an individual. 

A third difficulty is that the variables that define a 
given pixel may not be highly correlated with those 
that influence the organism directly. If this is the case, 
it is impossible to develop a predictive habitat model 
based on GIS procedures alone. 

A fourth difficulty is that spatially explicit analyses 
tend to be highly empirical and have relatively little 
theoretical foundation. Hence, the ability to generalize 
from a single analysis is severely limited. 

Fifth, GIS analyses are usually pattern- rather than 
process-based (but see Goodchild et al. 1993, 1996). 
That is, it may be simplest to match animal occurrence 
with pixel type; this matching may give some clues 
about possible processes driving populations, but it is 
well removed from these processes. It does not, there- 
fore, follow that if we manage to increase habitat fea- 
tures associated with a certain pixel type in which a 
target species is commonly found, we will necessarily 
increase the numbers of individuals of that species. 
For instance, it may be that the species is found in 
“wetland” pixels on our map, and all such pixels on
our map are near a river. Proximity to the river may
be critical to the species, but this would not be evident
from our analysis. Any subsequent attempt to increase 
wetlands away from the river would then fail to in- 
crease the target species. Knowing how the animals 
use their habitat would be critical to interpreting our 
initial GIS analysis. 

Finally, I would like to highlight a related problem. 
It is expensive, difficult, and time-consuming to put 
people into the field. It is much simpler to work di- 
rectly with maps and computers. It is easy to succumb 
to the temptation to believe, for instance, that maps of 
vegetation types based on GIS are accurate and to 
shortcut the field verification of vegetation composi- 
tion. Similarly, researchers may believe that they know 
which animals are associated with each vegetation 
type, without having to verify that the animals are ac- 
tually there. It is easy to think that we know which 
variables to include in our analyses without actually 
observing the annual pattern of habitat use of the tar- 
get species. It is easy to ignore variation in survival 
and reproduction across the landscape that may be 
obscured by dispersal into low-quality habitats. The 
use of GIS encourages us to cut corners and to believe 
results without field verification. Usually, error associ- 
ated with measurement, scaling, and extrapolation is 
ignored in describing the variances associated with 
model output. We need to make sure the standard of 
acceptance of GIS analyses does not encourage lazi- 
ness and corner cutting. 


Conclusions 


Habitat models that are very specific will not be useful 
for making predictions of management effects across 
time and space unless enormous resources are devoted 
to their development, as occurred for the spotted owl 
(Forsman et al. 1996a). They may, however, elucidate
processes that are important in driving population 
change; such processes can be incorporated into more- 
generalized models that do have predictive power. 
This process of generalizing requires that boundary 
conditions for a given habitat model be defined. Such 
definition may require the use of gradient-based rather 
than strictly categorical or ANOVA-based approaches. 
Models that achieve generality by collecting data at a 
very broad scale may be useful in the definition of
problems across habitat types, in initial inventories,
and in identification of large tracts of land for preser- 
vation. Because they are so distantly related to the 
processes that drive population change, however, they 
will not be useful in identifying the effects of process- 
oriented management policies regarding timber har- 
vest, stream flow, grazing, hunting pressure, and the 
like. If we pay close attention to matching model ob- 
jectives to the model type selected, rather than 
thoughtlessly embracing the latest habitat analysis 
technique, we are likely to make better progress to- 
ward meeting management objectives. Certainly, we 
should match the questions we ask to the scale at
which we gather information and model. We should
also de-emphasize model validation and accept the po- 
sition that habitat models are a means of quantita- 
tively assembling our best knowledge of animal-habi- 
tat relationships to make the most informed decisions 
possible and to identify research needs, rather than 
expecting the models to be predictive with P < 0.05. 
Finally, we should take advantage of GIS approaches 
to enhance our understanding of spatially expli- 
cit processes driving population change rather than 
using the tools and maps as an excuse to become lazy 
about the field work necessary to understand these 
processes. 


CHAPTER


5


Case Studies of the Use of Environmental
Gradients in Vegetation and Fauna Modeling:
Theory and Practice in Australia
and New Zealand


Michael P. Austin 


Knowledge of the distribution of species is central
to the conservation of biodiversity. Predicting the
distribution of species, whether plant or animal, de-
mands a high degree of knowledge in ecology, statisti-
cal modeling, geographic information systems (GIS),
and remote sensing to be cost-effective. Wildlife scien- 
tists (Verner et al. 1986a; Scott et al. 1993) and ecolo- 
gists (Margules and Austin 1991) have recognized the 
need to predict distribution but have developed their 
own paradigms (Margules and Redhead 1995; Scott et 
al. 1996). A degree of convergence between paradigms 
with respect to spatial modeling is now emerging (Lin- 
denmayer et al. 1990a, 1991b; Pereira and Itami 
1991; Mladenoff et al. 1995; Pausas et al. 1995), as 
evidenced by contributions in this book (e.g., Young 
and Hutto, Chapter 8; Fertig and Reiners, Chapter 
42). Any study of species distribution has several com- 
ponents. A certain ecological theory is assumed, either 
explicitly or implicitly. Environmental variables are se- 
lected as important for estimating species distribution. 
Certain sources of data or methods of survey design 
are accepted as suitable. Decisions are made about the 
use of GIS or remote sensing. Habitat scoring systems 
are devised or particular statistical modeling methods 
adopted. These decisions will reflect the research para-
digm of the researcher, but governing all this will be
the pragmatic decisions necessitated by available data,
skills, time, and resources. In this chapter, attention is
focused on the role of environmental variables in eco-
logical theory, survey design, and statistical modeling
as used to predict species distributions, particularly in 
Australia. 


Theory 


The common and pragmatic assumption of homoge- 
neous vegetation communities for conservation plan- 
ning does not have a sound theoretical base. The con- 
tinuum concept where species composition varies 
continuously along environmental gradients (Austin 
and Smith 1989) is more consistent with early quanti- 
tative studies (Whittaker 1956; Curtis 1959) and re- 
cent statistical modeling studies (Austin et al. 1990; 
Austin et al. 1994; Leathwick and Mitchell 1992; 
Leathwick 1995). Each species shows individualistic 
distribution patterns in relation to environmental vari- 
ables, particularly climatic variables such as tempera- 
ture and solar radiation, and local variables such as 
topographic position. Homogeneous communities of 
co-evolved species with discrete boundaries along en- 
vironmental gradients do not exist. However, the most 
frequent combinations of environmental conditions in 
a landscape will have characteristic combinations of 
species that can be recognized as communities for 
management purposes (Austin and Smith 1989). Shift- 
ing combinations of environmental conditions in the 
landscape and climatic gradients across regions will
restrict such community types to local regions only
(Austin 1991). 

Whittaker (1956, 1960) and Curtis (1959) pio- 
neered the continuum concept for plants and demon- 
strated the individualistic behavior of species, but it is 
not often considered for vertebrate fauna (Verner et al. 
1986b; Austin 1999a). Yet, the distribution of a 
species in an environmental space, the niche hypervol- 
ume of Hutchinson (1957), is a basic tenet for devel- 
oping a predictive model of a species distribution. For 
successful modeling, however, it is necessary to recog- 
nize the nature of the environmental space being used. 
Three idealized types of environmental gradient can 
be distinguished, though all kinds of intermediates 
may occur (Austin and Smith 1989; Huston 1994; 
Austin 1999b). 

Indirect gradients (complex gradients, Whittaker 
1978) have no direct physiological effect on growth. 
Altitude is an example; it is correlated with variables 
that do have an effect on growth, such as temperature 
and rainfall. The correlation of altitude with both 
temperature and rainfall will be location-specific and 
will confound any interpretation. Analysis of species 
environmental niche using indirect gradients will give 
predictive models that are only valid locally and will 
lack robustness. 
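Why a model built on an indirect gradient is only locally valid can be shown with a minimal sketch. It assumes a constant lapse rate of 6.5 degrees Celsius per kilometer and two invented regions that differ only in sea-level temperature; the species optimum, the region names, and the function name are hypothetical.

```python
LAPSE_RATE_C_PER_KM = 6.5  # assumed constant lapse rate

def altitude_of_temperature(target_temp_c, sea_level_temp_c):
    """Altitude (m) at which mean temperature equals target_temp_c."""
    return (sea_level_temp_c - target_temp_c) / LAPSE_RATE_C_PER_KM * 1000.0

species_optimum_c = 12.0  # hypothetical thermal optimum of a species
sea_level_temp = {"cool_region": 18.0, "warm_region": 24.0}  # hypothetical

# The same thermal optimum maps to a different altitude in each region.
optimum_altitude = {
    region: altitude_of_temperature(species_optimum_c, t0)
    for region, t0 in sea_level_temp.items()
}
for region, altitude in optimum_altitude.items():
    print(f"{region}: optimum near {altitude:.0f} m")
```

Under these assumptions the same temperature optimum falls at roughly 920 meters in one region and 1,850 meters in the other, so an altitude-based model fitted in the first region would misplace the species in the second.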

Direct gradients (regulator gradients; Huston 1994) 
are those that have a direct effect on growth but are 
not consumed (e.g., temperature). Temperature may 
have numerous direct effects. There is only limited 
theory to guide what the expected shape of a species 
response to a temperature variable might be, beyond 
the probability that it will be unimodal (Austin 1992). 
Predictions may be robust if a suitable expression of 
the temperature gradient is found. 

Resource gradients are those where the resource 
(e.g., soil nitrogen) is consumed by a plant. The prox-
imal causal resource will often be unknown (e.g.,
phosphate concentration at the surface of the root 
hair). Surrogate distal variables such as available soil 
phosphate are usually used. The species response func- 
tion is likely to be hyperbolic or perhaps unimodal. 
The more proximal the variable, the more robust the
predictions will be.

These gradients are idealized contrasting types. Soil 
moisture may be a resource gradient at low levels but
an indirect gradient at high levels. The waterlogged
soil at high levels will be anaerobic and toxic to many 
terrestrial plants; the direct variable may be lack of 
oxygen or presence of sulfide ions. Knowledge of envi- 
ronmental processes and the likely nature of species 
response functions is the key to developing robust 
models of species distributions. 

Models for fauna differ from those for plants, with
variables for food supply, nesting sites, protection
from predators, and size of territories being more important
than environmental conditions. Choice of appropriate
predictors depends on having a suitable theoretical 
framework, knowledge of relevant environmental 
processes, and ecological expertise on the organisms 
being studied. No amount of statistical modeling can 
compensate for poor definition of the problem. 


Data 


There exists a wide variety of data types arising from
biological sources and GIS and remote sensing devel- 
opments. Large conservation studies have had to de- 
pend on the data available to them rather than the 
data most suitable for prediction (Scott et al. 1996). 
Frequently the only data available are herbarium 
records or museum specimens. Such presence records 
without equivalent records of absence can only be 
used for statistical models that are conditional on the 
presence of the species. For example, models of species 
abundance can be developed based on the premise 
that the species is present. Methods exist for predict- 
ing the potential occurrence of species based on pres- 
ence data and climate records (Busby 1991; Austin 
1998). These methods predict the climatic envelope 
within which a species may occur, but they can say 
nothing about where it will be absent within climatic 
limits of the envelope. 
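The idea of a climatic envelope can be sketched as a simple bounding box in climate space. This is a deliberately simplified illustration, not the procedure of Busby (1991): the minimum-to-maximum rule, the function names, and the climate values are all assumptions made for the example.

```python
def climatic_envelope(presence_records):
    """Range (min, max) of each climate variable over the presence records."""
    variables = presence_records[0].keys()
    return {
        var: (min(r[var] for r in presence_records),
              max(r[var] for r in presence_records))
        for var in variables
    }

def within_envelope(site, envelope):
    """True if every climate variable at the site lies inside the envelope."""
    return all(low <= site[var] <= high for var, (low, high) in envelope.items())

# Hypothetical herbarium records with climate values attached.
presences = [
    {"mean_temp_c": 10.5, "rain_mm": 600},
    {"mean_temp_c": 12.0, "rain_mm": 850},
    {"mean_temp_c": 14.2, "rain_mm": 1100},
]
envelope = climatic_envelope(presences)

inside = within_envelope({"mean_temp_c": 13.0, "rain_mm": 700}, envelope)
outside = within_envelope({"mean_temp_c": 16.0, "rain_mm": 700}, envelope)
print(envelope, inside, outside)
```

A site inside the box is only climatically possible for the species; without absence records the sketch, like the methods it imitates, cannot say whether the species actually occurs there.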

A minimum data set suitable for statistical model- 
ing is the presence/absence records for a geocoded plot 
of specified size. If the plot's location has been 
recorded, then environmental predictors can be cap- 
tured via remote sensing and GIS layers. Numerous 
surveys exist that can provide such data for plants. 
Data for animals is less readily available and is usually 
conditioned on collecting effort. Collation of existing 
vegetation surveys in southeastern New South Wales
(NSW) has provided a data set of over nine thousand
plots (Austin et al. 1994). The problem with collated 
data sets of this kind is that they may represent a bi- 
ased sample from the region (Stockwell and Peterson, 
Chapter 48). 


Survey Design 


Environmental gradients can play a significant role in 
survey design where unbiased species data from 
geocoded locations is unavailable, and cost-effective 
surveys are needed (Margules and Austin 1991). To 
ensure a representative sample of the range of vegeta- 
tion composition, a survey should sample the environ- 
mental space defined by the major influential envi- 
ronmental gradients in the region. Choice of the 
gradients constitutes a hypothesis stating the pre- 
sumed principal environmental controls in the region. 

Austin and Heyligers (1989, 1991) describe a pro- 
cedure called gradsect sampling that is designed to 
provide a representative sample of the environmental 
space of a region. Transects are selected to traverse the 
major environmental gradients of a region in such a 
way as to minimize travel times and access problems. 
This gives a biased sample; areas not on the chosen 
gradsects cannot be selected for sampling. The aim of 
such a survey is to sample the potential range of vari- 
ation in vegetation composition, not to obtain an un- 
biased estimate of some mean value for the region. 
The survey design approach has been termed the SR³
strategy, where S = Stratification, R = Representation, 
R = Replication, and R = Randomization (Austin 
1998). Where access is less of a problem, a modified 
version of the design can be adopted. 

A recent vegetation survey of the Mid-Lachlan Val- 
ley in central New South Wales (Austin et al. 2000) 
provides an example (Fig. 5.1) for an area of 22,500 
square kilometers. Available GIS layers for the study 
area consisted of a digital elevation model (DEM) and 
polygon-based soil landscape maps (Kovac et al. 1990 
unpublished maps). Climate layers for mean annual 
temperature and mean annual rainfall were derived 
from the DEM using ANUCLIM (Hutchinson et al. 
undated). Four environmental factors were assumed 
to be important in the area: temperature, rainfall, soil 
type, and topographic position. The first two were
considered to be direct gradients. Soil type is a surro-
gate categorical variable for soil nutrients and soil
moisture. Topographic position is an ordered categor-
ical variable representing an indirect gradient, the soil
catena along which numerous soil variables are con-
founded. Mean annual temperature (9.1 to 17.0 de-
grees Celsius) and mean annual rainfall (350 to 1,330
millimeters) were divided into seven and eight classes,
respectively. There were 153 soil landscape types rep-
resenting extensive depositional landscapes separated
by low north-south mountain ranges with shallow
skeletal soils. When these landscape types were over-
laid with the climatic layers in the GIS, they formed
628 environmental stratification units (ESUs) for the
area, due to the correlation between GIS layers.

Figure 5.1. Location map for Mid-Lachlan survey in New South
Wales, Australia.
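The effect of correlation among layers on the number of stratification units can be checked with simple arithmetic and a toy overlay. In this minimal sketch the class counts (153, 7, 8) and the realized total (628) come from the text; the function name, the dictionary keys, and the six example grid cells are hypothetical.

```python
from collections import Counter

def stratification_units(cells):
    """Count grid cells in each (soil, temperature class, rainfall class) unit."""
    return Counter((c["soil"], c["temp_class"], c["rain_class"]) for c in cells)

# Combinations arithmetically possible if the layers were independent.
potential_units = 153 * 7 * 8
realized_fraction = 628 / potential_units
print(potential_units, round(realized_fraction, 3))

# Toy overlay: correlated layers produce few distinct combinations.
cells = [
    {"soil": "alluvial", "temp_class": 6, "rain_class": 1},
    {"soil": "alluvial", "temp_class": 6, "rain_class": 1},
    {"soil": "alluvial", "temp_class": 6, "rain_class": 2},
    {"soil": "skeletal", "temp_class": 2, "rain_class": 7},
    {"soil": "skeletal", "temp_class": 2, "rain_class": 7},
    {"soil": "skeletal", "temp_class": 2, "rain_class": 7},
]
units = stratification_units(cells)
print(len(units), "distinct units from", len(cells), "cells")
```

Only a small fraction of the arithmetically possible combinations is realized, which is what keeps a stratified survey of this kind tractable.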
The SR³ strategy was achieved by


1. Geographical stratification (S): The primary stra- 
tum was the nine 1:100,000 mapsheets of the area. 
The ESUs on each mapsheet were then sampled in- 
dependently. 

2. Environmental representation (R): A representative 
sample of the environmental space is obtained by 
sampling each ESU occurring on the mapsheet (see 
Fig. 5.2 in color section).

3. Replication (R): This was achieved by taking sam- 
ples from each ESU in proportion to the area of 
each ESU on the mapsheet following rules similar 
to those used by Austin and Heyligers (1989). 

4. Randomization (R): This is only strictly feasible for 
those ESUs that have large scattered areas on a sin- 
gle mapsheet. However, the repeated sampling of 
the same ESU on different mapsheets will result in 
a systematic random sampling of the geographical 
range of the ESU. 
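The representation, replication, and randomization steps above can be sketched for a single mapsheet as follows. The allocation rule used here (at least one plot per ESU, otherwise plots in proportion to area) is an assumption chosen for illustration and is not the rule set of Austin and Heyligers (1989); the function names, ESU labels, areas, and site identifiers are hypothetical.

```python
import random

def allocate_plots(esu_areas, total_plots):
    """Plots per ESU: proportional to area, but never fewer than one.

    Because of rounding and the one-plot minimum, the allocations may not
    sum exactly to total_plots.
    """
    total_area = sum(esu_areas.values())
    return {
        esu: max(1, round(total_plots * area / total_area))
        for esu, area in esu_areas.items()
    }

def draw_sites(candidate_sites, n_plots, seed=0):
    """Randomly choose sampling sites within one ESU, without replacement."""
    rng = random.Random(seed)
    return rng.sample(candidate_sites, min(n_plots, len(candidate_sites)))

# Hypothetical ESU areas (square kilometers) on one mapsheet.
esu_areas_km2 = {"ESU_a": 600, "ESU_b": 300, "ESU_c": 95, "ESU_d": 5}
allocation = allocate_plots(esu_areas_km2, total_plots=20)

# Hypothetical accessible sites lying within ESU_b.
candidates = [f"site_{i:02d}" for i in range(40)]
chosen = draw_sites(candidates, allocation["ESU_b"], seed=1)
print(allocation)
print(chosen)
```

The one-plot minimum keeps rare ESUs in the sample, which serves representation at the cost of strict proportionality.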


A two-stage sampling procedure is introduced to 
incorporate the fourth potential predictor, topo- 
graphic position, because it operates at a different ge- 
ographical scale from the other variables. At each 
sampling site (approx. 1 square kilometer), three plots 
from different topographic positions were measured 
for vegetation composition (Fig. 5.3). There were two 
pragmatic decisions made for logistical reasons. Only 
presence/absence of trees and shrubs taller than 1.5 
meters were recorded for each plot, and plots wher- 
ever possible were located on roadsides and public 
lands to avoid the time-consuming effort of seeking 
permission to sample on private land. 

This survey design is explicit, consistent, and re- 
peatable. Data from plots provide the evidence for in- 
terpretive vegetation maps, and that evidence is only 
as good as the survey design (Table 5.1). The design 
depends on extensive knowledge of the variables influ- 
encing the composition of eucalypt forests and wood- 
lands (Austin et al. 1997). There is one assumption 
among the many in this approach that is not always
made explicit: the vegetation is in equilibrium with the
environment as defined. There is growing circumstan- 
tial evidence from New Zealand that this assumption 
may not be true. There is evidence that species distri-
butions are not yet in equilibrium after a volcanic
eruption eighteen hundred years ago (Leathwick and
Mitchell 1992) or after the last glacial period due to
the slow rates of re-invasion of the dominant species
from the genus Nothofagus (Leathwick 1995, 1998;
Leathwick et al. 1996). Provided the primary geo-
graphical stratum (the mapsheet) is used in the de-
sign of surveys, this need not be an issue. It will, how-
ever, require explicit consideration in any statistical
modeling that is done.

Figure 5.3. Plot locations sampled based on the SR³ survey
strategy in Mid-Lachlan study area after taking account of ex-
isting survey plots.


TABLE 5.1. Distribution of plots sampled in relation to the climatic
gradients used to stratify the Mid-Lachlan survey. Nonexistent
combinations are indicated with a dash.


Statistical Modeling


The use of statistical models for prediction depends on
the availability of suitable environmental predictors.
These must be ecologically meaningful and exist as a
GIS layer for the entire region for which predictions 
are required. This reduces the number of possible pre- 
dictors, although the increasing use of remotely sensed 
variables may compensate for this (Aspinall and 
Veitch 1993). The recent availability of new methods 
of regression modeling has allowed the development 
of models more consistent with ecological theory 
(Austin et al. 1994; Franklin 1995). Both generalized 
linear modeling (GLM; McCullagh and Nelder 1989; 
Nicholls 1989, 1991) and generalized additive model- 
ing (GAM; Hastie and Tibshirani 1990; Yee and 
Mitchell 1991) have been used in vegetation studies in 
Australia (Austin et al. 1994; Austin and Meyers 
1996) and New Zealand (Leathwick 1995, 1998). 
Boyce and McDonald (1999) provide a recent review 
discussing some faunal examples; see also Özesmi and
Mitsch (1997). GLM greatly expands modeling op- 
portunities compared with classical linear regression, 
in particular making possible the easy analysis of pres- 
ence/absence data with both continuous variables and 
factors (Nicholls 1989; Austin et al. 1990). GAM al- 
lows the fitting of nonparametric functions to the data 
using a smoothing algorithm while maintaining signif- 
icance testing (Yee and Mitchell 1991). The great ad- 
vantage is that the exact shape of a species response to 
an environmental predictor does not have to be speci- 
fied prior to fitting the model. Given that theory sug- 
gests that a great variety of shapes are possible (Austin 
and Smith 1989; Austin 1999a), this is an important
advantage. There are numerous other statistical meth-
ods that deserve attention: decision trees (Breiman et
al. 1984; Walker 1990; Stockwell et al. 1990; Lees
and Ritman 1991), LOWESS (locally weighted scatterplot
smoothing; Cleveland 1979; Currie 1991), and neural
networks (Caudill 1990; Fitzgerald and Lees 1992).
How to evaluate the relative performance of these 
methods is a continuing issue (Austin 1994; Austin et 
al. 1995; Walker and Aspinall 1997). 
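How a GLM handles occurrence data with a unimodal response can be sketched with a binomial model on the logit scale that includes a quadratic temperature term. The following is a minimal, standard-library-only sketch fitted by Newton-Raphson; the function names, the coefficient values, and the noise-free synthetic data (expected proportions of occupied plots rather than field records) are all assumptions made for illustration.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_logistic(rows, y, iterations=25):
    """Fit a binomial GLM with logit link by Newton-Raphson."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, row))))
             for row in rows]
        # Score vector X'(y - p) and information matrix X'WX, W = p(1 - p).
        score = [sum(row[j] * (yi - pi) for row, yi, pi in zip(rows, y, p))
                 for j in range(k)]
        info = [[sum(row[i] * row[j] * pi * (1.0 - pi) for row, pi in zip(rows, p))
                 for j in range(k)] for i in range(k)]
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    return beta

# Hypothetical "true" model: logit(p) = 0.5 + 0.6 x - 0.3 x^2, x = temp - 13.
TRUE_BETA = [0.5, 0.6, -0.3]
temps = [8.0 + 0.5 * i for i in range(21)]
rows = [[1.0, t - 13.0, (t - 13.0) ** 2] for t in temps]
y = [1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(TRUE_BETA, row))))
     for row in rows]

beta = fit_logistic(rows, y)
optimum_temp = 13.0 - beta[1] / (2.0 * beta[2])
print([round(b, 4) for b in beta], round(optimum_temp, 2))
```

A negative quadratic coefficient gives a bell-shaped response on the probability scale, with its optimum at the temperature where the linear predictor peaks. With real presence/absence records, `y` would hold zeros and ones, and the parametric shape would have to be chosen in advance, which is the restriction that GAM relaxes.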

Environmental predictors for eucalypt species mod- 
eling at a regional scale in Australia have generally 
been climatic variables with temperature being the 
most important (Austin et al. 1997). Other useful pre-
dictors have been lithology, topographic position,
solar radiation corrected for aspect and slope, and soil
fertility index. Some early work at a more local scale
indicated that moisture stress based on soil water bal-
ance models and interspecific competition could con-
tribute to statistical models but could not be used for
GIS-based prediction in their current form (Austin et 
al. 1997). Leathwick (1995, 1998; Leathwick et al. 
1996) has successfully used soil moisture models to- 
gether with climate variables and interaction terms for 
forest models in New Zealand. Interestingly, solar ra- 
diation is a more important predictor than rainfall in 
New Zealand as compared with Australia. 

None of the array of statistical modeling techniques 
currently available to ecologists is without problems. 
The regression-based techniques of GLM and GAM 
are both limited by multicollinearity and the inade- 
quacies of stepwise regression procedures (James and 
McCulloch 1990). No consensus yet exists on the 
choice of spatial modeling method. Comparative eval- 
uations tend to champion one method. Two recent pa- 
pers have recommended neural networks over multi- 
ple regression (Lek et al. 1996c; Guégan et al. 1998). 
In one case (Guégan et al. 1998), only linear terms 
were used in the regression, and, in the other case (Lek 
et al. 1996c), use of best current practice with GAM 
and appropriate analysis of residuals might have 
yielded entirely different conclusions. 

Comparison of the performance of different tech- 
niques using real data is fraught with difficulties. 
Three techniques may give one answer, while another 
three give a different answer but with similar explana- 
tory power. Two possible conclusions might be that 
(1) all six techniques provide an inadequate answer or 
(2) each group of techniques finds half the answer, and 
no technique gives the full answer. Alternatively, 
model specification of predictors was inadequate, and 
better ecological and statistical skills would have given 
a much better model independent of the particular 
technique used. Comparisons of statistical modeling 
techniques are only possible where “truth” is known. 
To generate realistic “true data” requires theoretical 
knowledge of how species respond to environmental 
gradients and knowledge of the environmental pro- 
cesses, which generate the multicollinearity between 
environmental predictors. 

Austin et al. (1995) have attempted to investigate
this difficult problem. They constructed artificial data
based on explicit ecological theories about how plant
species respond to environmental gradients using the
computer package COMPAS (Minchin 1987). Two
types of species response were generated based on dif-
ferent theories: Gaussian (bell-shaped) response
functions (Gauch and Whittaker 1972) and skewed
β-functions (Austin et al. 1994). The assumption was
that the best technique should be robust to changes in
response function. These responses were then linked
to two sets of environmental predictors. The first set
comprised three constructed proximal causal variables
(temperature, radiation, and soil fertility). The second
set was derived from the direct gradients via plausible
environmental process models. For example, the indi-
rect gradient variables latitude, longitude, and altitude
were derived from temperature using a real data set
and a thin-plate spline surface fitted to climate station
data (Hutchinson 1997). Aspect and slope were de-
rived from radiation using a real data set and a com-
plex trigonometric function. In addition, a random
variable unrelated to the species was added to the en-
vironmental data. The difficulties of evaluation are
not discussed here (see Austin et al. 1995).

Figure 5.4. Comparison of response functions fitted by different statistical modeling techniques to artificial data for a species with
a Gaussian response function to the proximal “causal environmental gradients.” See text for description of terms.
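The two kinds of response shape contrasted in this study can be sketched as simple functions of one gradient. The parameterizations below are common textbook forms chosen for illustration; they are assumptions and are not claimed to be the exact forms implemented in COMPAS or used by Austin et al. (1995). All parameter values and function names are hypothetical.

```python
import math

def gaussian_response(x, optimum, tolerance, maximum=1.0):
    """Symmetric bell-shaped response centered on the optimum."""
    return maximum * math.exp(-((x - optimum) ** 2) / (2.0 * tolerance ** 2))

def beta_response(x, lower, upper, alpha, gamma, maximum=1.0):
    """Skewed response (x - lower)^alpha * (upper - x)^gamma, scaled to `maximum`.

    The response is zero outside (lower, upper); unequal alpha and gamma
    move the mode away from the midpoint of the range.
    """
    if x <= lower or x >= upper:
        return 0.0
    mode = lower + (upper - lower) * alpha / (alpha + gamma)

    def shape(v):
        return (v - lower) ** alpha * (upper - v) ** gamma

    return maximum * shape(x) / shape(mode)

# Hypothetical temperature gradient from 8 to 18 degrees Celsius.
for temp in (8.5, 10.5, 12.5, 14.0, 16.0):
    g = gaussian_response(temp, optimum=14.0, tolerance=2.0)
    b = beta_response(temp, lower=8.0, upper=18.0, alpha=1.0, gamma=3.0)
    print(f"{temp:5.1f}  gaussian={g:.3f}  beta={b:.3f}")
```

With these hypothetical parameters the β-function peaks at 10.5 degrees Celsius and declines more slowly toward warmer values than toward cooler ones, whereas the Gaussian curve is symmetric about its optimum.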

Figure 5.4 shows an example of the type of results 
obtained in the simplest case using the predictor. A
β-function is a particular parametric function suitable
for fitting certain skewed responses (Austin et al. 
1994). Data restriction is applied when the range of 
observations along the environmental gradient clearly 
exceeds the species environmental niche. This can re- 
duce prediction errors at the edge of the species niche 
(Austin and Meyers 1996). The true Gaussian curves 
are recovered by all the techniques but with varying
degrees of over- or underestimation. Use of the indi-
rect predictors results in much more complex failures
of model-fitting (see Austin et al. 1995). Four conclu- 
sions from the study are relevant here: 


1. A random variable may be selected as a highly sig- 
nificant predictor by GLM and GAM unless special 
attention is paid to error functions and the re- 
sponse curves obtained. 

2. Of the methods tested, GAM is the most successful 
at recovering the true relationships but is not with- 
out problems. 

3. The correct mix of ecological and statistical expert- 
ise is more important than the particular technique 
used. 

4. It is much more difficult to obtain satisfactory un- 
biased predictive models with indirect gradients as 
predictors. 


The use of environmental gradients in predictive 
models of species distribution is a necessary but insuf- 
ficient condition for successful modeling. Species mod- 
els often show spatial autocorrelation of the residuals 
implying that the significance estimates for the predic- 
tors are biased and inflated. Most species distributions 
are spatially autocorrelated as are many environmen- 
tal variables (e.g., rainfall and temperature). Use of 
such predictors will remove some of the autocorrela- 
tion. Any remaining autocorrelation may be due to an 
unknown environmental variable having spatial auto- 
correlation or to species not being in equilibrium with 
their environment. This lack of equilibrium can arise 
from historical factors related to dispersal rates of 
species or local extinction from suitable habitat due 
to, for example, plant collecting or hunting. 

After fitting an environmental model, Smith (1994) added
a proximity variable: the occurrence of the species within
a local neighborhood of the observation.
Leathwick (Leathwick et al. 1996; Leathwick 1998) 
has used this approach to analyze forest distribution 
patterns in New Zealand. The importance of the prox- 
imity variable for southern beech Nothofagus species 
was ascribed to the poor dispersal and invasive capa- 
bilities of these forest dominants and helped to explain 
the existence of “Beech Gaps” (Leathwick 1998). Au- 
gustin et al. (1996) have approached this problem from 
a different angle, studying red deer (Cervus elaphus)
distribution in Scotland (see also Boyce and McDonald
1999). Testing for spatial patterns in the residuals from 
environmental models needs to become standard pro- 
cedure in any species distribution modeling. 
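One common way to test residuals for spatial pattern is Moran's I. The following minimal sketch computes it with binary neighbor weights defined by a distance threshold; the weighting scheme, the function name, and the example transect are assumptions for illustration and are not the procedures used by Smith (1994) or Augustin et al. (1996).

```python
import math

def morans_i(coords, residuals, max_distance):
    """Moran's I with weight 1 for pairs of plots within max_distance, else 0."""
    n = len(residuals)
    mean = sum(residuals) / n
    z = [r - mean for r in residuals]
    cross = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= max_distance:
                cross += z[i] * z[j]
                weight_sum += 1.0
    return (n / weight_sum) * cross / sum(v * v for v in z)

# Hypothetical transect of six plots spaced one unit apart.
coords = [(float(i), 0.0) for i in range(6)]

# Residuals clustered in space versus residuals alternating in sign.
clustered = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

i_clustered = morans_i(coords, clustered, max_distance=1.0)
i_alternating = morans_i(coords, alternating, max_distance=1.0)
print(round(i_clustered, 3), round(i_alternating, 3))
```

Values well above the null expectation of -1/(n - 1) indicate that neighboring residuals are alike, as in the clustered case; judging significance would require a further step such as a permutation test, which is not shown here.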


Case Studies of 
Fauna Modeling in Australia


The understanding and modeling of the distribution of 
forest fauna in southeastern Australia has increased 
greatly in the last fifteen years (Braithwaite et al. 
1989; Cork and Catling 1996; Woinarski et al. 1997; 
Landsberg and Cork 1997; Lindenmayer et al. 1999). 
Braithwaite and colleagues published a series of pa- 
pers on the distribution of arboreal marsupials 
(Braithwaite 1983; Braithwaite et al. 1983, 1984). 
They found that 52 percent of the forested area on the 
far south coast of New South Wales had no arboreal 
marsupials while 63 percent of the arboreals were 
found on 9 percent of the area. This difference is re- 
lated to differences in foliage inorganic nutrient con- 
centrations of potassium, nitrogen, and phosphorus as 
arboreal marsupials are predominantly folivores. Fo- 
liage nutrients are correlated with the occurrence of 
certain eucalypt forest communities and hence to vari- 
ations in lithology (Braithwaite et al. 1984). A thresh- 
old in foliage nutrient levels is apparent below which 
arboreals are almost absent and above which other 
factors appear to operate (Fig. 5.5; Braithwaite et al. 1984). 

Stockwell et al. (1990) modeled the distribution of 
the arboreal marsupial, the greater glider (Petauroides 
volans), in a nearby area using regression and decision 
trees with stand condition, slope, site quality, and fo- 
liage nutrients as predictors. Pausas et al. (1995) used 
GLM to model the original data of Braithwaite et al. 
(1984) on species richness of arboreal marsupials. 
Two models were developed: one for low foliage nu- 
trient concentrations and the other for high nutrient 
concentrations. The model at low nutrients contained 
the predictors foliage nutrient concentration, soil nu- 
trient index, topographic position, hole index, and 
bark index. The first three and the fifth relate to food 
quality. Hole index is a measure of nesting sites and 
depends on tree age and tree species. Bark index is a 
measure of the amount of decorticating bark at the 
site, an indicator of insect availability. Above the 
foliage nutrient threshold, only forest structural 
features were important. The model consisted of the 
predictors: number of trees with diameter greater than 60 
centimeters, an indicator of potential nesting hollows, 
and the proportion and basal area of small trees. This 
model was later used in a forest dynamic simulator for 
predicting the habitat quality for arboreal marsupials 
under different logging regimes (Pausas et al. 1997; 
Pausas and Austin 1998). 

Figure 5.5. Numbers of arboreal marsupials in relation to 
foliage potassium concentration for each eucalypt community. 
Redrawn from Braithwaite et al. (1984). 

Lindenmayer and colleagues have studied arboreal 
marsupials in Victoria 400 kilometers from the previ- 
ous studies (Smith and Lindenmayer 1988; Linden- 
mayer et al. 1990a,b). The eucalypt forests of the Cen- 
tral Highlands of Victoria are markedly different from 
those of the south coast of New South Wales. They 
tend to form monospecific even-aged stands of either 
Eucalyptus regnans or E. delegatensis (Lindenmayer et 
al. 1990a) as compared with multispecies eucalypt 
communities (Braithwaite et al. 1984). Two arboreal 
species, mountain brushtail possum (Trichosurus caninus) 
and the greater glider, were modeled using Poisson 
regression (Lindenmayer et al. 1990a). The model 
for the possum indicated that the preferred areas (3 
hectares) had high numbers of hollow-bearing trees, 
high basal areas for Acacia species, and occurred in 
gullies with few shrubs. The model for the greater 
glider contained only the number of hollow-bearing 
trees and stand age, the species preferring older stands 
originating before 1900. 
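
For readers unfamiliar with Poisson regression, the following minimal sketch fits a log-linear model of animal counts against a single habitat predictor by Newton-Raphson iteration. The data are invented for illustration, and the function name `fit_poisson` and the single-predictor form are assumptions; the published models used more predictors and different software.

```python
import math


def fit_poisson(x, y, n_iter=50, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the Poisson log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2 x 2)
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system in closed form
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Invented example: hollow-bearing trees per site and animals counted
hollow_trees = [0, 1, 2, 3, 4, 5, 6, 7, 8]
animals = [0, 1, 1, 2, 2, 4, 5, 7, 9]
b0, b1 = fit_poisson(hollow_trees, animals)
print(f"intercept = {b0:.3f}, slope per hollow-bearing tree = {b1:.3f}")
```

A positive slope means the expected count is multiplied by exp(b1) for each additional hollow-bearing tree, which is how coefficients of such models are usually read.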

These models were subsequently tested on an inde- 
pendent data set and shown to be robust (Lindenmayer 
et al. 1994). Another study (Lindenmayer et al. 1993) 
with the same species compared their occurrence in in- 
tact forests and in forested wildlife corridors. They 
concluded that the probability of occurrence for the 
two species did not differ between the two habitats. 
The model for the greater glider was combined with a 
GIS to predict the species' spatial distribution (Lindenmayer 
et al. 1995). A kernel-smoothing procedure was 
used to allow for the spatial dependence present in the 
data. The focus of these studies and others (Linden- 
mayer et al. 1991a,b,c) has been on the existence and 
characteristics of hollow-bearing trees as the primary 
habitat limitation for these species in even-aged re- 
growth forests. A recent attempt to model arboreal 
marsupial species distribution at a landscape scale was 
unsuccessful except for one species (Lindenmayer et al. 
1999). The environmental predictors were estimated 
for 20- and 80-hectare circles surrounding the 
3-hectare survey sites. The only species for which any 
of the predictors were significant, the yellow-bellied 
glider (Petaurus australis), is also the only species 
with a home range (40-60 hectares) comparable in size to the 
sampling units used. The other three arboreal species 
have ranges less than 6 hectares in size (Lindenmayer 
et al. 1999). The size of plot used for modeling must 
relate to the home range of the species. Foliage nutrient 
concentration does not appear to have been examined 
in this part of Victoria. 

Catling and Burt (1994, 1995a,b) studied ground 
mammals in an area on the south coast of New South 
Wales, north of the area studied by Braithwaite et al. 
(1984). They found a response to foliage nutrient 
concentration different from that for arboreals. 
Abundance of native small mammals (less than 200 grams) 
was negatively correlated with nitrogen, phosphorus, 
and potassium in the foliage but positively related to 
foliage magnesium (Catling and Burt 1995a). Habitat 
complexity (Newsome and Catling 1979) had the 
highest positive influence on ground mammal abundance. 
Catling and Burt (1995b) examined the distribution 
and abundance of ground-dwelling mammals 
in relation to different types of environmental 
gradients, including direct gradients (temperature and 
rainfall) and indirect gradients (aspect, slope, and 
topographic position). Although species abundances 
were sensitive to many of the predictors, none were as 
important as habitat complexity (e.g., height and 
cover of shrubs and abundance of logs), which reflects 
disturbances such as grazing, fire, and logging. Catling et 
al. (1998) fitted GLM models using both environmen- 
tal gradients and habitat variables. The general con- 
clusion was that variables related to habitat complex- 
ity, eucalypt community, and a nutrient factor based 
on lithology were important determinants of species 
distribution. 

The introduced predators, European red fox 
(Vulpes vulpes) and the domestic cat (Felis catus), 
were abundant in southern NSW and medium-sized 
mammals were absent or in low abundance. In north- 
ern NSW, Catling and Burt (1997) found increased 
abundance of medium-sized native mammals 
(200-6,000 grams) and greater diversity of small 
mammals. The fox was absent from large parts of the 
region and in low abundance elsewhere, suggesting 
that predation is a controlling factor. 

The importance of different environmental gradi- 
ents and habitat variables, such as tree hollows, varies 
markedly between regions for arboreal marsupials. 
The south coast of NSW is characterized by a diversity 
of mixed eucalypt communities on a wide variety of 
predominantly low-nutrient lithologies (Braithwaite et 
al. 1984). The Central Highlands of Victoria has 
mainly monospecific tree communities on relatively nu- 
trient rich lithologies. The dominant species E. regnans 
and E. delegatensis are readily killed by fire and de- 
velop even-aged stands with overstories of tall dead 
trees. It is not surprising, therefore, that food quality is 
more important in southern NSW and the presence 
and nature of tree hollows more important in Victoria. 
The ground mammals on the south coast of NSW 
respond differently from arboreal mammals to foliage 
nutrients, although ground mammal abundance and 
bird species richness respond similarly to foliage 
magnesium (Braithwaite et al. 1989; Catling and Burt 
1995b). Habitat complexity and the risk of predation 
from introduced foxes are the main determinants. The 
relevant environmental gradients or habitat variables 
for inclusion in predictive models of fauna distribution 
vary even for the same species from region to region. 
Statistical models and their predictor variables are 
highly contingent on the precise combinations of envi- 
ronmental and biotic conditions in a specific region. 


Discussion 


The ecological theory that determines the success of 
predictive species modeling differs radically between 
plant and animal ecology, yet the central concept of a 
species niche is common to both. The physical envi- 
ronment in terms of climate and soils is clearly more 
important for plants. The extent to which biotic pre- 
dictors are more important than environmental pre- 
dictors will depend on the exact circumstances. The 
existence and nature of source and sink areas within 
the range of an animal species is ultimately contingent 
on environmental differences between the areas. Mor- 
ton (1990), Braithwaite and Muller (1997), and 
Soderquist and MacNally (2000) provide examples of 
the importance of this effect from different parts of 
the Australian continent. Difficulties in incorporating 
these effects in modeling arise when species popula- 
tions are not in equilibrium with the environment and 
invasive species are displacing others. The recent re- 
sults of Leathwick (1998) in New Zealand suggest 
that plant ecologists will need to consider species dis- 
persal abilities and historical disturbance patterns in 
order to develop models that are more robust. Spatial 
autocorrelation can no longer be ignored in species 
modeling. Statistical models of fauna in Australia 
(Catling et al. 1998; Lindenmayer et al. 1999) have 
yet to demonstrate the importance of climate predic- 
tors. However, BIOCLIM models of fauna presence 
records reviewed in Burgman and Lindenmayer (1998; 
see also Nix [1986]) indicate that climate may have 
considerable importance for many faunal groups (cf. 
Currie 1991). The relative importance of environmen- 
tal and habitat variables for fauna groups needs to be 
determined but will depend on the scale of the study. 
Large areas or areas with steep environmental gradi- 
ents are likely to have climatic gradients to which ani- 
mals are sensitive. Where steep environmental gradi- 
ents exist, stratified surveys based on the gradients are 
likely to be necessary if only to reduce the bias in ex- 
isting records. 
The choice of statistical modeling technique remains 
uncertain; the best practice may in fact be to 
combine several techniques. The strengths and weak- 
nesses of different techniques will need to be evaluated 
with “true data.” The problem of constructing realistic 
ecological data for such tests has yet to be fully solved. The transfer of 
techniques from one research area to another without 
testing whether the technique is compatible with the 
new theoretical framework is a common problem 
(Austin 1999a,b). Statistical modeling cannot substitute 
for ecological insight, appropriate environmental 
gradients, and knowledge of the processes linking 
environment with biota. 
CHAPTER 6 


Habitat Models Based 
on Numerical Comparisons 


K. Shawn Smallwood 


Habitat Model Types and Analysis 


Habitat analysis typically involves either a compari- 
son of used versus not used environmental elements 
or a test for disproportionate use of environmental 
elements from a measured set of available environ- 
mental elements. This chapter will focus on the latter 
approach (Table 6.1). The measure of use can in- 
clude numerical spatial patterns (distributions) based 
on an individual’s locations or on collections of indi- 
viduals (Fig. 6.1). Using estimates of density or the 
number of individuals to perform habitat analysis 
also involves the use of pattern analysis, because 
density and spatial distribution are interlinked (Tay- 
lor 1961; Taylor et al. 1978). If organisms were pat- 
terned uniformly or randomly across a landscape or 
region, then habitat analysis would not be informa- 
tive. The species would be considered as simply ubiq- 
uitous. However, species do not distribute themselves 
this way. Numbers vary spatially, and implicit in 
most habitat models is the assumption that numeri- 
cal peaks coincide with locations where habitat is of 
the highest quality (but see Van Horne 1983 for 
warnings against this assumption). 
Using densities or numbers of individuals for habi- 
tat analysis, the measurement units for the response 
side of the analysis can range in demographic organ- 
ization from subpopulations and populations to 
megapopulations (sensu Garshelis and Visser 1997) 
and metapopulations (sensu Hanski and Gilpin 
1997). These response units are measured on geo- 
graphic areas that encompass at least several home 
ranges to regions (Fig. 6.1). Many habitat models in- 
clude the term N to represent number of individuals, 
D to represent density, or some other term represent- 
ing number or spatial intensity in the environment. 
The terms N and D are explicitly represented in 
some habitat models (e.g., Table 6.1, Equations 6.1, 
6.5, 6.11, 6.12, 6.13) and are needed for frequency 
counts and ratios used in the other models. There- 
fore, habitat analysis is scale-dependent because N 
and D are scale-dependent (Verner 1981; Blackburn 
and Gaston 1996; Smallwood and Schonewald 
1996). Furthermore, these models do not specify 
which demographic unit is being represented by N or 
D, nor the contexts of demography, season, interan- 
nual variability, or condition of the landscape, all of 
which can affect N or D (Cyr 1997; Smallwood in 
press). 

Habitat models using numerical comparisons have 
been reviewed already (e.g., Morrison et al. 1998). 
Van Horne (1983) pointed out that density is a mis- 
leading indicator of habitat quality, mainly because 
density estimates are frequently too narrow in their 
representation of a population’s success. According to 
Van Horne (1983), habitat studies are often spatially 
and temporally inadequate for including areas occu- 
pied during all seasons, nor do they include the full 
TABLE 6.1. 

Habitat models based on numbers or densities. 

Each entry lists the equation number and model name, the model structure, and an explanation of terms. 


Measures of effect, no P-value 

6.1. Edge index (Helle and Jarvinen 1986) 
Model structure: $E = \frac{D_i}{D_j} \times 100\%$ 
Explanation: $D_i$ and $D_j$ are densities at sites $i$ and $j$ 

6.2. Index of selection (Paloheimo 1979) 
Model structure: [illegible in source] 
Explanation: $r_i$ is the percentage of the $i$th element used, and $p_i$ is the percentage available in the measured set 

6.3. Index of electivity (Ivlev 1961; Jacobs 1974; Gordon 1989) 
Model structure: [illegible in source] 
Explanation: $r_i$ and $p_i$ as in Equation 6.2 

6.4. Usage-availability rank difference (Johnson 1980) 
Model structure: $t_i = \mathrm{rank}\ r_i - \mathrm{rank}\ p_i$ 
Explanation: $r_i$ and $p_i$ as in Equation 6.2 


Hypothesis tests with P-values 

6.5. Isodar theory of habitat selection (Morris 1990) 
Model structure: $D_1 = \frac{a_1 - a_2}{b_1} + \frac{b_2}{b_1} D_2$ 
Explanation: in the $i$th habitat, $a_i$ = maximum fitness, $b_i$ = per capita reduction in fitness 

6.6. $\chi^2$ test with Bonferroni Z test (Pearson 1900; Neu et al. 1974) 
Model structure: $\chi^2 = \sum \frac{(O - E)^2}{E}$ 
Explanation: $O$ = observed, and $E = p_i \times N$ = expected number of observations 

6.7. Log-likelihood ratio criterion (Fisher 1924, 1950; Everitt 1977) 
Model structure: $2 \sum O \log_e \frac{O}{E}$ 
Explanation: $O$ and $E$ as in Equation 6.6 

6.8. Measure of aggregation (David and Moore 1954; Pielou 1977) 
Model structure: [partly illegible in source; based on $\sum_i (x_i - \bar{x})^2$] 
Explanation: $x_i$ = number of occurrences in $i$th element; $\bar{x}$ = mean of occurrences among $n$ elements 

6.9. Ecological order (Smallwood 1993) 
Model structure: $M_i = \frac{O}{E}$ 
Explanation: $M_i$ = the number observed as a multiple of the number expected in the $i$th element 

6.10. Negentropy (Fisher 1924; Shannon and Weaver 1949; Rothstein 1951; Phipps 1981) 
Model structure: [partly illegible in source; a sum over $j = 1$ to $s$ involving $\log_e \frac{O}{E}$] 
Explanation: Use of information; use of the measured set of choices; deviation from uniform association indicates negative entropy 

6.11. Surface regression model (Rotenberry 1986) 
Model structure: $D = b_0 + \sum_i^h b_i x_i + \sum_i^h b_{ii} x_i^2 + \sum_i^h \sum_j^h b_{ij} x_i x_j$ 
Explanation: $x_i$ = the $i$th habitat, $h$ is the number of habitat variables, and $b$s are fitted coefficients 

6.12. Correlation (Wiens and Rotenberry 1981) 
Model structure: $d = a h_i$ 
Explanation: In a strip transect, $h_i$ is the $i$th habitat attribute expressed as a continuous variable, e.g., percent cover 

6.13. Patch shape/Type (Otis 1998) 
Model structure: $\ln[E(D_{ij})] = \beta_0 + \ln(A_{ij}) + \beta_1 (S_{ij}) + \sum_i \beta_i P_i$ 
Explanation: $A_{ij}$ = area of the $j$th patch in the $i$th environmental element; $S$ = shape, which is a function of the patch perimeter and its area; $P$ = patch perimeter 
Figure 6.1. Numerical data in habitat models can include 
individual locations within a home range or any number of 
individuals with or without reference to a demographic unit. The grain of 
the measured set of environmental elements typically grades 
from micro- to macro-habitat representation as the numerical 
data grade from individual locations to regional scope. 


range of numerical responses to environmental 
changes, nor do they address the social interactions 
leading to occupation of both source and sink areas. 
Vickery et al. (1992) found density to be a poor 
predictor of nest success among three passerine 
bird species, although they did not account for the 
effect their variable plot sizes had on density 
(Smallwood 1999). Nevertheless, unburdened by any 
possible effect of variable plot sizes on density, Morris 
(1989) found mean litter size of the white-footed 
mouse (Peromyscus leucopus) to decline with increas- 
ing density. Alldredge et al. (1998) discussed faulty as- 
sumptions underlying various habitat models, such as 
the assumption that the sample of animals is random 
(rather than belonging or not belonging to a popula- 
tion), that each observation is independent of the next, 
that each animal's habitat selection was independent 
of others in the sample, that the availability of envi- 
ronmental elements is known rather than estimated, 
that availability is constant over the period of the 
study (rather than changing seasonally or for other 
reasons), and that detectability of animals is constant 
among the environmental elements sampled. Small- 
wood (1993) began a discussion on the theoretical 
foundations of these models, focusing on the role of 
thermodynamics in resource selection models. 

In this chapter, I will frame habitat analysis within 
the larger context of world-views on how and why an- 
imals distribute themselves and how and why analysts 
apply experimental design principles and statistical 
tests to habitat studies based on relative number or 
[Figure 6.2 diagram. Measurements and comparison: pattern of abundance × incidence map -> habitat association (selection, preference)] 


Figure 6.2. The effectiveness of numerically based habitat 
models and the measurements of use and availability ulti- 
mately depend on the attributes of the field studies and the 
world-views of the analyst, which have primacy over the actual 
measurements of use and availability. 


density. Figure 6.2 depicts a hierarchical framework 
for discussing habitat analysis, first considering world- 
views (paradigms), then attributes of study and inter- 
pretive design, and finally the measurements and com- 
parisons, the results and interpretation of which 
depend on world-views, and study and interpretive de- 
sign. I intend to demonstrate that the available habitat 
models are inadequate by themselves for characterizing 
habitat of animal species; these models are no substi- 
tute for long-term research experience, although their 
predictions can be used as testable hypotheses. Just as 
habitat analysts need operational terms with which to 
work (Morrison and Hall, Chapter 2), they also need a 
larger framework with which to interpret applications 
of the habitat models based on numerical comparisons. 
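
To make the use-versus-availability comparison concrete, the sketch below computes the χ² statistic of Equation 6.6, the observed-to-expected multiples of Equation 6.9, and Bonferroni-adjusted intervals on the proportions used, in the manner usually attributed to Neu et al. (1974). The function name, the returned fields, and the example counts are assumptions made for illustration; the χ² statistic is returned with its degrees of freedom so that it can be compared with a tabulated critical value.

```python
from statistics import NormalDist


def use_availability(observed, available, alpha=0.05):
    """Compare observed use of k environmental elements with the
    proportions available; returns (chi-square, df, per-element rows)."""
    n = sum(observed)
    total = sum(available)
    p_avail = [a / total for a in available]
    expected = [p * n for p in p_avail]          # E = p_i * N
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    k = len(observed)
    # Bonferroni-adjusted normal deviate for k simultaneous intervals
    z = NormalDist().inv_cdf(1 - alpha / (2 * k))
    rows = []
    for o, e, p in zip(observed, expected, p_avail):
        p_use = o / n
        half = z * (p_use * (1 - p_use) / n) ** 0.5
        low, high = p_use - half, p_use + half
        if low > p:
            verdict = "used more than available"
        elif high < p:
            verdict = "used less than available"
        else:
            verdict = "in proportion"
        rows.append({"ratio": o / e, "interval": (low, high), "verdict": verdict})
    return chi2, k - 1, rows


# Invented example: 100 observations across three environmental elements
# whose areas make up 25, 25, and 50 percent of the measured set
chi2, df, rows = use_availability([60, 30, 10], [25, 25, 50])
print(chi2, df)
for row in rows:
    print(row)
```

The sketch inherits every assumption discussed above: independent observations, known and constant availability, and equal detectability among elements.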


The Theoretical Null Pattern 


The significance of a measured numerical pattern (spa- 
tial distribution) to habitat use is decided largely by its 
comparison to a theoretical null pattern. Measured nu- 
merical patterns can be uniform, random, aggregated, 
or regular, and null patterns can be uniform, random, 
or both (Fig. 6.3). Deviation of a measured pattern 
from the null pattern is assumed to be caused by a re- 
lationship between the species and the energy or mate- 
rial resources composing the environmental elements 
(Smallwood 1993). The theoretical foundations of the 
null pattern are therefore critical for interpreting the 
association. The null pattern in thermodynamics is uni- 
formity, which connotes the equilibrium cold state, or 
Figure 6.3. Random patterns occur somewhere along a gradient 
ranging from uniformity (regularity) to aggregation, thereby 
reducing the distance that can be measured between an ob- 
served aggregated pattern and the null pattern of randomness. 


lack of energy relationships (Hutchinson 1953). The 
null pattern in information theory is also uniformity, 
which connotes no use of the available information 
(Shannon and Weaver 1949; Kullback 1959; Phipps 
1981). Uniformity is the state of maximum entropy in 
both thermodynamics and information theory. The 
null pattern in ecology is randomness, which connotes 
lack of influence from the locations of other individu- 
als (Fisher 1950; Taylor 1961). The null pattern chosen 
by the habitat analyst identifies the world-view of the 
analyst regarding how and why nature is organized, 
and it determines the set of alternative results. 
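
The statement that uniformity is the state of maximum entropy can be checked numerically. The sketch below uses Shannon's entropy of the proportions of observations among elements; the function names and the example counts are assumptions for illustration, and the "departure from uniform" measure shown here is a simple difference from the maximum, not necessarily the negentropy formula of Table 6.1.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of the proportions in `counts`."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def departure_from_uniform(counts):
    """Maximum possible entropy, log(k), minus the observed entropy.
    Zero for a uniform pattern; larger for more aggregated patterns."""
    return math.log(len(counts)) - shannon_entropy(counts)


# Invented counts of observations in four environmental elements
for label, counts in [("uniform", [5, 5, 5, 5]),
                      ("aggregated", [17, 1, 1, 1]),
                      ("all in one element", [20, 0, 0, 0])]:
    print(label, round(departure_from_uniform(counts), 4))
```

Under this view, any departure from the uniform null is read as use of the available information, which is why the choice of null pattern shapes the conclusions that can be drawn.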

Habitat analysts usually explicitly or implicitly as- 
sume that spatial distributions are determined by 
some form of energy relationship(s) between the or- 
ganisms and their environment (Hall et al. 1997b; 
Morrison et al. 1998). Although rarely discussed ex- 
plicitly, many ecologists hold the world-view that bio- 
logical processes are governed ultimately by the laws 
of thermodynamics (Hutchinson 1953). The measures 
of effect used in Equations 6.6-6.10 are derived from 
these laws, and all the other models in Table 6.1 
(except Equation 6.5) are theoretically and mathematically 
related to Equations 6.6-6.10 (Smallwood 
1993). Many aspects of habitat are assumed to be en- 
ergy-related, and the goals of wildlife are usually as- 
sumed to be resource acquisition and total fitness 
(Southwood 1977; Rosenzweig 1985; Wiens 1989b,c), 
which are energy-related and energy-dependent, 
respectively. Habitat elements other than energy are 
recognized as important, such as water, nutrients, 
refugia, travel corridors, nest sites, and so on. However, 
even the response variable, density, has its conceptual 
origins in the physical sciences and is energy- 
dependent. High-quality habitat is said to have greater 
levels of energy available (Hall et al. 1997b). 

However, if the laws of thermodynamics largely gov- 
ern the spatial distributions of animals, then assuming 
randomness as the null pattern is not only inappropri- 
ate on theoretical grounds but also reduces the analyst's 
capacity to recognize meaningful aggregations and se- 
lection of environmental elements (Smallwood 1993). 
Random patterns occur somewhere along a gradient 
that ranges from uniformity (regularity) to aggregation 
(Fig. 6.3). This gradient is represented by probability 
values of the χ² distribution: lower-tail values 
correspond to uniformity, and increasingly upper-tail 
values grade through randomness and eventually to 
aggregated patterns. The distance that can be 
measured between an observed aggregated pattern and 
the null pattern of randomness is less than between the 
aggregated pattern and the null pattern of uniformity. 
Therefore, assuming the null pattern is random instead 
of uniform washes out some of the potential signifi- 
cance that can be attributed to an observed aggregated 
pattern. The world-view of the analyst is at least 
implicitly expressed by the types of habitat model and 
statistical test used, because these models and tests are 
structured on the assumption that the null pattern is 
either uniform (e.g., χ² tests) or random (e.g., ANOVA tests). 
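
The gradient from uniformity through randomness to aggregation can be illustrated with the familiar index of dispersion, the sum of squared deviations of counts from their mean divided by the mean, which is approximately χ² distributed with n - 1 degrees of freedom when counts are random (Poisson). The sketch below is a minimal illustration with invented counts; the function name is an assumption.

```python
def dispersion_statistic(counts):
    """Index of dispersion: sum((x - mean)^2) / mean, with n - 1 df.
    Values near 0 suggest uniformity, values near the df suggest
    randomness, and much larger values suggest aggregation."""
    n = len(counts)
    mean = sum(counts) / n
    stat = sum((x - mean) ** 2 for x in counts) / mean
    return stat, n - 1


# Invented counts in ten equal-sized sampling units, each set totaling 40
patterns = {
    "uniform": [4, 4, 4, 4, 4, 4, 4, 4, 4, 4],
    "roughly random": [2, 5, 3, 4, 6, 4, 1, 7, 4, 4],
    "aggregated": [0, 0, 0, 1, 0, 0, 38, 1, 0, 0],
}
for label, counts in patterns.items():
    stat, df = dispersion_statistic(counts)
    print(f"{label}: statistic = {stat}, df = {df}")
```

The aggregated pattern lies much farther from the uniform end of the gradient (a statistic of zero) than from the random middle (a statistic near the degrees of freedom), which is the sense in which a random null reduces the measurable distance.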

Only aggregated patterns of animals can lead to in- 
ferences about habitat use. Purely uniform or random 
patterns of use will fail to reveal any selection of envi- 
ronmental elements from a measured set because no 
disproportional patterns will be evident (Fig. 6.4). A 
regular pattern of distribution might indicate that the 
study area included territories of individuals number- 
ing fewer than occurred in the larger population (Fig. 
6.5). In other words, the study area was smaller than 
the area occupied by the population, which often 
arranges itself into regularly spaced home ranges or 
territories held by its members. Similarly, random pat- 
terns have been observed when progeny (of insects) 
dispersed passively in air or water (Taylor et al. 1978), 
or when occurrence was scarce or at the edge of a 
larger aggregation (Fig. 6.5). 

Recognizing whether the spatial pattern is uniform, 
random, aggregated, or regular depends on both the 
Figure 6.4. To interpret meaningful associations, patterns of 
distribution are compared to a map of the measured incidence 
of environmental elements, composing a measured set. The 
meaning of an association ultimately depends on the world- 
view of the analyst and on the theoretical foundations of the 
statistical test or measure of effect being used. It also de- 
pends on the type of aggregations that are being compared to 
the measured set of available environmental elements. 


spatial scale of observation and knowledge of the 
species’ demographic organization. Examining the 
spatial relations of individuals from a portion of a 
population can mislead an analyst into thinking the 
pattern of distribution is random or uniform, when it 
is more likely regular at the scale appropriate for 
measuring the population and aggregated at the re- 
gional scale. For example, prairie dog (Cynomys spp.) 
burrows occur fairly regularly within colonies (Tile- 
ston and Lechleitner 1966), which are themselves ag- 
gregations in the region (Koford 1958). Of course, in- 
formation about the spatial distribution is lost when 
transforming the locations of individuals into numeri- 
cal estimates and then to densities. Comparing N or D 
for habitat analysis cannot lead to conclusions about 
demographic organization unless the spatial areas are 
included in the comparison (Smallwood 1999). 
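
The nearest-neighbor method of Clark and Evans (1954), cited in Figure 6.5, shows how the verdict of regular, random, or aggregated depends on the area assigned to the study. The sketch below is a bare-bones version that assumes a known study area and applies no edge correction; the function name and the example points are hypothetical.

```python
import math


def clark_evans(points, area):
    """Return (R, z) for the Clark-Evans nearest-neighbor test.
    R near 1 suggests randomness, R > 1 regularity, R < 1 aggregation."""
    n = len(points)
    # Distance from each point to its nearest neighbor
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    mean_observed = sum(nearest) / n
    density = n / area
    mean_expected = 1.0 / (2.0 * math.sqrt(density))  # under randomness
    std_error = 0.26136 / math.sqrt(n * density)
    return mean_observed / mean_expected, (mean_observed - mean_expected) / std_error


# Hypothetical burrow centers on a regular lattice filling a 10 x 10 area
lattice = [(i + 0.5, j + 0.5) for i in range(10) for j in range(10)]
print(clark_evans(lattice, area=100.0))

# The same number of burrows packed into one corner of the same study area
cluster = [(0.1 * i, 0.1 * j) for i in range(10) for j in range(10)]
print(clark_evans(cluster, area=100.0))
```

Both point sets are perfectly regular at their own scale; only the second appears aggregated because the study area is much larger than the space the burrows occupy, which is the scale dependence described above.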


Types of Aggregation 


Aggregations can form for several reasons (Fig. 6.6), 
and each of these reasons poses a unique implication 
for interpreting the habitat association. These reasons 
bear on the choice of the theoretical null pattern and 
the habitat model. It will be important for wildlife 
biologists to determine the relative occurrence 
frequencies of each of the following four types of aggregation: 
resource, demographic, early-stage, and constrained. 
Individuals may aggregate around a centralized or 
patchy resource, forming a resource aggregation. 
Competition for a limited resource can force some an- 
imals to live on the fringe of the resource patch, or 
even to spill over into ecological sinks. Such an aggre- 
gation can extend beyond the boundary of the re- 
source patch just because off-patch individuals get suf- 
ficient access to the resource or because those 
exploiting the resource generate progeny that disperse 
Figure 6.5. The size and position of study areas reveal 
different spatial patterns among burrow systems of California 
ground squirrels (Spermophilus beecheyi). Using the pacing 
method for mapping burrow systems of pocket gophers 
(Smallwood and Erickson 1995), I mapped the approximate centers 
of ground squirrel burrow systems on 66.7 hectares of annual 
grassland in the low-elevation foothills north of Fresno, 
California, during April and May 2000. A nearest-neighbor-distance 
method (Clark and Evans 1954) indicated that ground squirrel 
burrow systems were regularly distributed within a larger 
aggregation and randomly distributed at the edge of the 
aggregation. The larger aggregation was only recognizable within the 
boundary of the largest (66.6 hectares) study area. The 
density of burrow systems was also greater within the aggregation 
compared to the edge. 


R = Index of aggregation 
z = standard normal deviate 
Figure 6.6. The types of aggregations observed largely 
determine the pattern of distribution and the association with the 
measured set of environmental elements. Demography, 
resource patchiness, habitat fragmentation, and early-stage 
filling of ecological space can affect our perception of use and 
availability. 


outward, beyond the patch. In either case, and in the 
case of other possible explanations, the number of in- 
dividuals associated with the resource-based aggrega- 
tion need not be bound within the resource patch. 

Individuals also aggregate with conspecifics to im- 
prove fitness through reproduction, rearing of young, 
and group cooperation such as foraging and predator 
avoidance. They may choose places to live based on 
the momentum of the congregation rather than the en- 
ergetic or nutrient accessibility of the habitat patch 
(Alatalo et al. 1985). Demographic aggregations may 
have spatial limits that are imposed by behaviors or by 
uncertainties of habitat quality outside the aggrega- 
tion (Stamps 1991). Demographic aggregations con- 
ceivably could be confined to space that is smaller in 
extent than the resource patch (Taylor and Taylor 
1979). Social constraints could therefore limit our ob- 
served use of an environmental element (resource) to 
less than that expected based on its availability. 
Dasgupta and Alldredge (1998) devised a behavioral 
dependency parameter for use with χ² tests, involving 
situations where multiple individuals are observed 
together. However, aggregations can occur without 
individuals being observed together per se (e.g., Puma 
concolor, Smallwood 1997) and may nevertheless have 
formed to serve social and demographic needs 
(Lloyd 1967). 

An early-stage aggregation can also form while im- 
migrants invade previously unoccupied habitat, which 
could have been cleared of former occupants, or 
which was recently discovered or became available to 
the species. An invasion of pocket gophers (Tho- 
momys spp.) into new stands of alfalfa (Smallwood 
and Geng 1997) is a good example. Early-stage aggre- 
gations are combined resource and demographic ag- 
gregations but temporarily give different results when 
analyzing habitat selection based on use and availabil- 
ity methods. Such aggregations provide the best op- 
portunity to observe preference of the available envi- 
ronmental elements because they are relatively free of 
demographic constraints (e.g., competition and terri- 
toriality). However, as numbers increase due to con- 
tinued immigration, social mechanisms such as territo- 
riality force later arrivals into less-preferred locations, 
thereby affecting our perception of use (Fig. 6.7). An 
example of early-stage aggregations can be found in 
“Intrapopulation Numbers,” later in this chapter. 
Constrained aggregations result from habitat frag- 
mentation (sensu Wilcox and Murphy 1985) or the 
division of previously contiguous habitat patches (Ad- 
dicott et al. 1987). They may also result from intoler- 
able environmental conditions occurring naturally 
outside the aggregation, including the occurrence of 
competitors or predators (Hutchinson 1953; Koford 
1958). Patches of low-quality habitat are occupied by 
constrained aggregations because the species is left 
with no other place to go (du Toit 1995). Low-quality 
habitat can appear to be high-quality habitat when 
loss of habitat constrains individuals to fragmented 
habitat patches or to peripheral habitat areas. Habitat 
analysis using constrained aggregations can be mis- 
leading due to packing of individuals into the habitat 
fragment or the inclusion of the intolerable areas in 
the measured set of environmental elements. Unfortu- 
nately, many special-status rare species are found only 
in fragmented habitat, or in habitat at the boundary of 
their tolerable conditions (e.g., Scott et al. 1986), so 
their constrained aggregations should not be used 
alone to interpret habitat associations. For example, 
giant garter snakes (Thamnophis gigas, federally 
endangered) occur in marshes and adjacent irrigation 
canals where marshes in the Central Valley of Califor- 
nia have not been converted entirely to agricultural 
fields and houses. The study of historical data and
taxonomically and functionally similar species might help
investigators interpret habitat associations for
special-status rare species. For example, habitat descriptions
of T. sirtalis, T. couchi, and T. elegans might provide
insight into the historical habitat of T. gigas.

Figure 6.7. The relationship between selection for the field
edge and pocket gopher (Thomomys bottae) density across
alfalfa fields (Medicago sativa). A ratio of the observed value
to the expected value of 1.0 indicates the edge was used in
proportion to its availability.

The prevailing view among those using the models 
in Table 6.1 appears to be that resource aggregations 
result in our measurements of disproportionate use of 
environmental elements. For example, Rotenberry et 
al. (Chapter 22) present a habitat selection model for 
which they assume that most wildlife observations are 
made where these animals want to be—where they de- 
rive some benefit from resources. However, the models 
in Table 6.1, and that of Rotenberry et al., can yield 
the same measures of disproportionate use when ap- 
plied to aggregations that were forced by habitat frag- 
mentation or the need to congregate. Applied to T. 
gigas, these models will indicate that T. gigas selects ir- 
rigation canals, probably because the landscape matrix 
is annual field crops where the snake cannot live. The 
models in Table 6.1 will not discriminate among aggre- 
gations influenced by various disparate factors, so they 
cannot be relied upon to measure habitat quality. 


Demographic Organization 


The demographic units represented by the numbers 
compared also bear on habitat analyses (Van Horne 
1983). Whether or not demographic organization can 
be related to use of environmental elements, it affects 
the pattern of distribution observed. I recently discovered
a range in size of study areas in which the number
of individuals might be given some meaning in
terms of demographic organization (Smallwood 
1999). As I had also found for mammalian carnivora 
(Smallwood 1999), nearly half the numerical estimates 
of northern goshawk (Accipiter gentilis) increased 
proportionally with increases in the sizes of their cor- 
responding study areas (scale domain A, Fig. 6.8). 
Given that species of carnivora and northern goshawk 
typically space themselves at fairly even distances due 
to territory maintenance, I interpreted the aforemen- 
tioned pattern as an indication that these estimates 
were made at areas smaller than those occupied by the 
“population.” Each species in Smallwood (1999) ap- 
peared to have what I termed a threshold area, at 
which the number of individuals ceased to increase 
proportionally with increasing study area size (as in 
Fig. 6.8). The threshold area was the low end of a spa- 
tial scale domain (B), in which the number of individ- 
uals varied considerably but did not regress on study 
area size with a slope significantly different from zero. 
I interpreted the numerical estimates within scale do- 
main B to represent populations because no other de- 
mographic units have been defined for the clusters of 
twenty-five to sixty adults that are typical of this scale 
domain (Smallwood 1999). 

Regardless of whether the estimates in scale domain 
B represented distinct populations, the theoretical 
foundation of animal ecology includes the population 
as a key demographic unit with functional, goal- 
directed significance. Odum (1959) and Dasmann 
(1981) defined a population as some collection of or- 
ganisms of the same species occupying a particular 
space and sharing a suite of attributes representing a 
unique organizational structure. Yet our use of N 
(number of individuals) and D (density) in habitat and 
other analyses is usually given no meaning with re- 
spect to the population concept. Usually, no demo- 
graphic unit is attributed to N or D in habitat models. 
In conducting habitat analysis using a measured set of 
environmental elements, what does it mean to com- 
pare three individuals in element X to fifty in element 
Y and to four hundred in element Z, when the three 
individuals are a small portion of a population, the 
fifty compose an entire population, and the four hundred
are from six populations? The three in element X
may occur in an ecological sink after having been
forced there by the fifty, which themselves occur in element
Y out of proportion to Y's availability but well
within the space needed to support a population.

Figure 6.8. Based on the published estimates (summarized in
Smallwood 1998), the number of nesting pairs of northern
goshawks (Accipiter gentilis) increases proportionally with
increasing size of study area until the study area is at least 100
square kilometers (scale domain A). Study areas larger than
100 square kilometers include aggregations that no longer
increase with increasing study area size (scale domain B).

That the number of individuals in spatial scale do- 
main B did not correlate with study area size also re- 
vealed a mathematical artifact in measuring animal 
density (Smallwood 1999). The number of individuals 
in domain B is relatively similar in magnitude to those 
in larger-area domains (see Smallwood 1999). Re- 
gressing density on its corresponding study area size 
will force a slope of -1 when density was calculated 
by dividing a constant number by a variable area. Sim- 
ilarly, regressing density on its corresponding study 
area size will force a slope of 0 when density was cal- 
culated using the numbers that increased proportion- 
ally with study area size, such as those in scale domain 
A and those forming the transitions between scale do- 
mains B and larger. Therefore, density appears to be a 
continuous variable only in the absence of demographic
organization, and its discontinuity can confound
habitat analyses based on the models in Table 6.1.
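
The forced slopes described above can be checked numerically. The following is a minimal sketch, not the analysis of Smallwood (1999). It assumes the regression is fit on log-transformed density and study area size, which is the scale on which the forced slopes are exactly -1 and 0. The study area sizes, the constant count of 40, the rate of 0.05 individuals per square kilometer, and the helper name `loglog_slope` are all hypothetical.

```python
import math

def loglog_slope(areas, densities):
    """Ordinary least-squares slope of log10(density) on log10(area)."""
    xs = [math.log10(a) for a in areas]
    ys = [math.log10(d) for d in densities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical study area sizes (square kilometers).
areas = [100.0, 200.0, 400.0, 800.0, 1600.0]

# Case 1: the count stays constant while the area varies,
# as described for scale domain B.
constant_n = 40
density_constant_n = [constant_n / a for a in areas]

# Case 2: the count increases proportionally with area,
# as described for scale domain A.
per_km2 = 0.05
density_proportional_n = [(per_km2 * a) / a for a in areas]

slope_b = loglog_slope(areas, density_constant_n)
slope_a = loglog_slope(areas, density_proportional_n)
print(f"constant N: slope = {slope_b:.3f}")
print(f"proportional N: slope = {slope_a:.3f}")
```

In both cases the slope is fixed by how density was calculated, not by any property of the habitat, which is the sense in which the relationship is an artifact.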

Spatial scale (extent) domains of distribution provide
habitat analysts with a means to attribute social
meaning to the frequency counts used in the habitat 
models. These domains revealed a possible demo- 
graphic context against which frequency data can be 
compared. I lack the directed field research evidence 
needed to conclude that the numerical estimates in 
spatial scale domain B were distinct populations. 
However, the population concept can still be used to 
discuss the intent of habitat analysts and the meaning 
of their results. 


Home Range 


Habitat analysis within the individual’s home range 
fails to reveal the significance of the location of the 
home range. Differential use of areas within the indi- 
vidual’s home range is interesting at the microhabitat 
level, but the location of the home range itself can be 
influenced by the location of the population and the 
social and demographic status of the individual (Van 
Horne 1983). An individual’s home range can encom- 
pass low-quality habitat simply because it was forced 
to live there in order to participate as a member of the 
population (Stamps 1991). Leger et al. (1983) found 
that California ground squirrels (Spermophilus 
beecheyi) adjust their behaviors to access microhabi- 
tats that would otherwise pose increased risks of pre- 
dation. Animals do not always go to a place because it 
provides the highest-quality habitat, but once at the 
place, they make the best of it. Reports of habitat 
studies based on telemetry locations of individuals 
often provide no information on the demographic status
of the individual or on the individual’s spatial relationship
to the population.


Intrapopulation Numbers 


Comparing numerical estimates made at the subpopu- 
lation level (scale domain A) can confound the habitat 
analysis with the proportional relationship between 
the estimated number of individuals and study area 
size (Smallwood 1999), unless study sites (plots) were 
chosen randomly or systematically from a region, used 
and unused sites were compared, or long-term obser- 
vations were used. Habitat analysis at this numerical 
level risks observing proportional patterns of use sim- 
ply because individuals or their home ranges are regu- 
larly distributed among microhabitat elements due to 
territoriality or other social interactions that partitioned
the resources at this level. Habitat analysis at
this level of organization may or may not contribute
useful inference regarding habitat, as social interactions
are driving the spatial distribution within a
larger area of habitat. Unless home range size can be
related to habitat quality, habitat identified from a
home range or from within the bounds of a population
was pseudoreplicated (Hurlbert 1984; Aebischer
et al. 1993).

As an illustration of apparent habitat use based on 
intra-population numbers, I compared a measure of 
resource selection (Equation 6.9 in Table 6.1) to 
Botta's pocket gopher (Thomomys bottae) density to 
test whether measured selection for the edge of alfalfa 
(Medicago sativa L.) fields was density-dependent: 


\[
\frac{\text{Observed gophers at edge}}{\text{Expected gophers at edge}}
= \frac{\text{gophers along edge} \,/\, \text{ha along edge}}{\text{gophers in whole field} \,/\, \text{ha of field}}
\]


The density estimates were from 134 counts of bur- 
rows across thirty-seven alfalfa fields in Yolo County, 
California, during 1992 through 1994 (Smallwood 
and Geng 1997). 
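
The observed-to-expected ratio can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the procedure of Smallwood and Geng (1997): it takes the expected value to be the gopher density over the whole field, and the field size, edge-strip area, gopher counts, and the function name `edge_selection_ratio` are hypothetical.

```python
def edge_selection_ratio(gophers_edge, ha_edge, gophers_field, ha_field):
    """Observed density along the edge divided by the density expected
    if gophers were spread evenly over the whole field (an assumption)."""
    if ha_edge <= 0 or ha_field <= 0:
        raise ValueError("areas must be positive")
    if gophers_field <= 0:
        raise ValueError("the field must contain at least one gopher")
    observed = gophers_edge / ha_edge
    expected = gophers_field / ha_field
    return observed / expected

# Hypothetical 10-ha field, 1 ha of which lies within the edge strip.
# Low density: half of the 10 gophers are along the edge.
ratio_clustered = edge_selection_ratio(gophers_edge=5, ha_edge=1.0,
                                       gophers_field=10, ha_field=10.0)
# High density: the 40 gophers are spread evenly over the field.
ratio_even = edge_selection_ratio(gophers_edge=4, ha_edge=1.0,
                                  gophers_field=40, ha_field=10.0)
print(ratio_clustered)  # 5.0
print(ratio_even)       # 1.0
```

A ratio above 1.0 indicates more gophers along the edge than its share of the field would predict, and a ratio of 1.0 indicates use in proportion to availability.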

At low density, gophers occurred along the field 
edge at nearly five times the number expected by 
chance (Fig. 6.7). However, increasing density reduced 
my ability to quantitatively represent gopher selection
for the field edge (r² = 0.71, Root MSE = 0.49, P <
0.001):

\[
\frac{\text{Observed gophers at edge}}{\text{Expected gophers at edge}} = 3.09 - 1.43 \log(\text{Density})
\]
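
The fitted line can be evaluated to see where measured selection disappears. The sketch below uses only the intercept (3.09) and slope (-1.43) reported above. The base of the logarithm and the units of density are not stated here, so the base-10 default is an assumption, and the function names are hypothetical.

```python
import math

INTERCEPT = 3.09  # intercept of the fitted line reported in the text
SLOPE = -1.43     # coefficient on log Density reported in the text

def predicted_ratio(density, base=10.0):
    """Observed/expected edge ratio predicted at a given density."""
    return INTERCEPT + SLOPE * math.log(density, base)

def density_at_ratio(ratio, base=10.0):
    """Invert the fitted line: density at which it returns `ratio`."""
    return base ** ((ratio - INTERCEPT) / SLOPE)

# Density at which the fitted line reaches proportional use (ratio = 1.0),
# under the assumed base-10 logarithm.
d_proportional = density_at_ratio(1.0)
print(f"ratio reaches 1.0 at density ~ {d_proportional:.1f}")
```

Passing `base=math.e` instead gives the corresponding density if the regression was fit with natural logarithms. Either way, the fitted line predicts a density beyond which the edge appears to be used only in proportion to its availability.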


The obvious preference for the edge grew increas- 
ingly hidden as gophers invaded the interior of the 
field and approached a regular distribution due to sat- 
uration of territorial space, represented by a ratio of 
1.0 between the observed and expected gophers at the 
field edge. I do not think that gophers favored the 
edge any less as density increased, but rather their 
preference for the edge grew increasingly less recog- 
nizable as the overall density in the field increased 
(Fig. 6.7). As gophers saturated the field interior, the 
clustering along the edge became hidden, just as
Hansen and Remmenga (1961) found a clustered gopher
distribution to vanish as the increasing density
shifted the gopher population to a regular distribu- 
tion. This density-dependence of distribution was rec- 
ognized before (Taylor et al. 1978), but the density- 
dependence of measuring selection preference or 
avoidance will require habitat analysts to reconsider 
current reliance on measures and statistical tests based 
on use and availability of resources. Measured selec- 
tion of environmental elements will be density- and 
therefore area- and time-dependent (also see Smallwood
1995), just as information theoretical measures
(e.g., H', the composite index of species richness,
log₂ S, and evenness, J) are dependent on sample size
(Cousins 1977) or on the spatial scale at which these
measurements are derived.

This density- or scale-dependence also bears on 
the timing of habitat analysis. Measures of resource 
use will vary depending on the stage of ecological 
succession of a site or the stage at which a site is 
being colonized. Preference can be clearly observed 
during the early stages of colonization, but during 
the later stages the preferred resources can be hidden 
by overflow of individuals into relatively less-pre- 
ferred conditions. 


Interpopulation Numbers 


Comparing numbers (densities) between populations 
poses two potential obstacles to recognizing true pat- 
terns of habitat association. The spatial shifting of ag- 
gregations summarized by Taylor and Taylor (1979) 
involves clustering at a subset of the available high- 
quality habitat patches. Taylor and Taylor (1979) pro- 
posed four hypotheses to explain the frequently ob- 
served spatial shifting of aggregations: (1) populations 
must move once they deplete their most limiting re- 
sources; (2) population members shift locations in- 
nately so as to prevent the exhaustion of resources; (3) 
dispersal and territory establishment of the next gen- 
eration also establishes the location of the next aggre- 
gation, while the previous aggregation senesces; and, 
(4) a combination of hypotheses 1-3. Occupied habi- 
tat patches are either not different from many of those 
that are unoccupied at the time of the analysis or they 
are ephemeral in their quality. In the latter case, the re- 
sults of habitat analysis are ephemeral. In the former 
case, habitat analysis might measure use as less
than availability, and a strict adherence to the numeri- 
cal comparison might mislead the investigator to 
conclude that the occupied habitat patches are less 
preferred. 

Today’s absence at a site can be tomorrow’s presence,
as the spatial distribution is unique for each generation
(Taylor and Taylor 1979). Habitat measured
at time t₁ can represent the species’ habitat at times t₂
and t₃ only if the same generation and the same environmental
conditions span times t₁ through t₃. If the
species shifted locations between generations, as is
common according to Taylor and Taylor (1979) and
den Boer (1981), or if environmental conditions
changed between times t₁, t₂, and t₃, then the analysis
at time t₁ will be inadequate and misleading if the goal
is to describe the species’ habitat; it will contribute to 
a belief that the species relies on a narrower range of 
environmental conditions than it actually does. The 
exception will be constrained aggregations, which will 
be at the same locations at all times, so long as none 
of the aggregations go extinct. However, habitat 
analyses using constrained aggregations are predis- 
posed to mislead simply due to the constraints on 
availability of suitable environmental elements. 

An additional obstacle is the meaning we attach to 
numerical variations of each population. Population 
number can vary greatly, with increases and decreases 
lagging behind changes in conditions of resources. In- 
terannual variability in N typically cannot be charac- 
terized adequately until monitoring has been con- 
ducted spanning multiple generations (Cyr 1997). 
Therefore, when comparing N or D from population 
X to that of population Y, and N or D differs, can we 
really conclude that habitat quality differs between the 
sites occupied by populations X and Y? 

Presumably, sociality and energy and material re- 
sources maintain a carrying capacity just below which 
exists an optimal density. Within the spatial bounds of 
a population, density can vary somewhat but must be 
constrained by territoriality or the habitat element in 
most limited supply. Comparing locations occupied by 
populations, density might be nearly the same within 
the bounds of each population (Smallwood 1999). 
Comparing density within the spatial bounds of a 
population or between high-density populations is unlikely
to reveal meaningful differences and is therefore
relatively uninteresting. In other words, density needs 
to vary considerably to reveal possible differences in 
habitat quality, but intrapopulation regularity of dis- 
tribution and mutually high-density populations will 
not vary sufficiently in N or D to provide inference re- 
garding habitat selection. 

The terms N and D in habitat models can be repre- 
sentative of whole populations, and the occurrence 
frequency or spatial areas occupied by these popula- 
tions compared to the measured set of environmental 
elements within a region or the species’ range. The N 
or D representing a whole population can simply be 
replaced by a 1 to denote presence or a 0 to denote ab- 
sence, and little information would be lost so long as 
the populations are discrete and easily bounded. Rep- 
resenting whole populations, presence or absence 
might be just as informative as N or D. If this step 
were taken, then the boundary of the area that is oc- 
cupied by the population would need to be identified. 
The area occupied by the population might be more 
reliable as a measure of use than either N or D. 


Megapopulation or Regional Numbers


Demographic organization is again implicitly ignored
by habitat models that compare densities to measure 
use and availability of elements between regions. Re- 
gions are usually large-enough areas to include multi- 
ple populations. Garshelis and Visser (1997) termed 
the collective abundance from regions as megapopula- 
tions because they did not know what else to call it. 
Metapopulations, on the other hand, theoretically or- 
ganize within regions (Hanski and Gilpin 1997), but 
empirical evidence is lacking for knowing the bounds 
of a metapopulation or how many populations might 
compose a metapopulation. Regional patterns of dis- 
tribution are still largely theoretical, and the term 
megapopulation indicates that this theory is weak. 
Comparing densities or use of regions poses the ad- 
ditional problem of comparing the availability of dis- 
parate measured sets of environmental elements 
(Wiens and Rotenberry 1981b). For example, Small- 
wood and Fitzhugh (1995) compared the number of 
puma track sets to represent use of available vegeta- 
tion complexes and topographic categories across all 
the sampled areas of California, which included areas 
spanning the full north-south and east-west extents of
the state within the species’ range. The topographic 
categories were common across the state, consisting of 
ridges, mountain peaks, basins, canyons, and so on. 
These macrohabitat categories were fairly comparable 
across California, much like Anderson's (1981) use of 
vegetation structural elements across the United 
States. However, sage-juniper forests are limited to 
north-central and eastern California, and chaparrals 
occur more to the west; they are not interspersed. 
Comparing use to availability of vegetation categories 
seemed uninformative on a statewide level, especially 
when puma populations occurred in both sage-juniper 
and chaparral. Knowing the range of environments 
used by pumas in California is useful, and knowing 
that pumas are more rare in the Mojave Desert than in 
the Klamath Mountains is also useful, but habitat 
analysis involving use and availability comparisons is 
probably best conducted within regions, within 
which key elements are more likely to be naturally 
interspersed. 


Gradient of Abundance 


Animal density is often represented with contours, 
which illustrate relatively smooth gradations in density 
from high to low across landscapes or regions (e.g., 
Taylor and Taylor 1977; Wiens and Rotenberry 1981b; 
Cody 1985a,b; Scott et al. 1986; Root 1988a,b,c; Price 
et al. 1995; Morrison et al. 1998). These gradients are 
mathematically derived using averaging and interpola- 
tion, whereas the actual spatial distribution may not be 
smoothly graded. This transformation of field observa- 
tions into density contours expresses a world-view of 
distribution that acknowledges aggregation as the 
norm for animal species but also facilitates the poten- 
tially erroneous idea that animal density corresponds 
to habitat quality. Habitat quality is assumed to be en- 
ergy-related, as discussed previously, so it is related to 
the world-view that the laws of thermodynamics gov- 
ern the spatial patterning of animal species, also previ- 
ously discussed. However, this world-view, that density 
grades smoothly across the landscape, can be supported
by few examples, just as sharply bounded aggregations
can be supported by few examples (Morrison
et al. 1998). It may be that spatial distributions of
animals can be better represented by categories, including
population occurrence, trace activity (i.e., a few individuals
in a large area), and absence. Lidicker (1995)
also summarized categorizations of habitat quality, 
which were thought to bear directly on densities. Habi- 
tat analysts need to test whether animal aggregations 
are discrete or graded, whether population boundaries 
can be identified, and whether densities are categorical 
in correspondence with habitat quality. Empirical evi- 
dence should be the foundation of theory, and theory 
that bears on gradients of abundance also bears on 
habitat analysis. 

It is well documented that individuals can occur in 
what are regarded as ecological sinks for the species. 
In fact, numerical estimates in ecological sinks can al- 
ways be greater than zero just because dispersing ani- 
mals are forced into the sink where the rates of re- 
cruitment outpace mortality. Habitat models based on 
numerical comparison will thus give ecological sinks 
some positive habitat value in such cases, whereas the 
true value should be negative (less than zero) with re- 
spect to the functionality of the population. Density 
estimates alone cannot inform of sink conditions 
(Lidicker 1975; Van Horne 1983), because density estimates
cannot be negative; they range from zero to
some presumed carrying capacity. Learning of sink 
conditions requires intensive study, making use of 
more information than numerical estimates (e.g., 
Morris 1989). 


Preconception of Habitat 


The map of available environmental elements is made 
by the habitat analyst. Some a priori notion of habitat 
inevitably goes into construction of the incidence map 
and will affect the typology of the map and its spatial 
grain (Austin, Chapter 5). For example, Smallwood 
and Fitzhugh (1995) used typical home range size of 
puma to decide on the quadrat size and transect 
lengths for counting track sets in California. The vol- 
unteer biologists who selected the exact locations of 
the transect segments believed that roads along 
ridgetops would produce more tracks. We later 
learned that roads along streams were most produc- 
tive. If we were to start a new sampling program for 
puma track sets, it would differ markedly from the 
1985 design, and the results would likely differ as
well. 

Preconceived notions of habitat influence the 
study's location, spatial extent, grain of mapping, and 
ultimately the replication and interspersion of the en- 
vironmental elements in the map. Habitat analysts 
often collect the use and availability data from loca- 
tions where the species was known to occur, often in 
abundance. By siting habitat studies this way, analysts 
may often force results that conform to preconceived 
notions of habitat. Experimental design principles 
should be applied to mensurative as well as to manip- 
ulative studies (Smallwood 1993). The spatial extent 
and grain of the study should be appropriate to meas- 
uring the species' differential use of the environment. 
Replication and interspersion of treatments (i.e., envi- 
ronmental elements considered to be available to the 
species) also must be incorporated into mensurative 
studies (also see Otis 1998; Austin, Chapter 5). Along 
a transect or within a defined study area, replication 
of available elements is achieved through multiple oc- 
currence of each environmental element in the meas- 
ured set of elements (e.g., vegetation complexes). 
Those elements occurring once are pseudoreplicated 
(sensu Hurlbert 1984) and association of the species 
with that element should be considered dubious pend- 
ing further research (e.g., environmental elements X 
and Z in Fig. 6.4 are pseudoreplicated, while Y is 
measured as two replicates). Interspersion is achieved 
by each element's occurrence between other elements 
in the measured set, such that gradient effects do not 
cause spurious relationships between measured use 
and availability. Thus, replication and interspersion of 
various environmental elements can be achieved when 
the program of observation includes two or more 
patches of each environmental element within the 
study area or along the transect. 
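
The replication criterion described above can be checked mechanically once a transect has been mapped. The following is a minimal sketch under assumed inputs: the transect is represented as an ordered list of element labels, one per consecutive segment, and the example labels and both function names are hypothetical. It does not reproduce the layout of Fig. 6.4.

```python
from itertools import groupby

def patches_per_element(transect):
    """Count separate patches of each element along an ordered transect.

    `transect` lists the element recorded in each consecutive segment;
    adjacent segments with the same element form one patch.
    """
    counts = {}
    for element, _run in groupby(transect):
        counts[element] = counts.get(element, 0) + 1
    return counts

def pseudoreplicated(transect):
    """Elements represented by a single patch (no replication)."""
    counts = patches_per_element(transect)
    return sorted(e for e, n in counts.items() if n < 2)

# Hypothetical transect: X and Z occur once each, Y occurs as two patches.
transect = ["X", "X", "Y", "Y", "Z", "Z", "Z", "Y"]
print(patches_per_element(transect))  # {'X': 1, 'Y': 2, 'Z': 1}
print(pseudoreplicated(transect))     # ['X', 'Z']
```

Because adjacent segments of the same element are merged into one patch, any element counted twice or more is necessarily separated by other elements, which is a simple one-dimensional form of interspersion. Elements returned by `pseudoreplicated` would need further sampling before their association with the species is interpreted.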

Given that the analyst is constrained by observing 
extant environmental conditions, the observed use of 
the environment by an individual, population, or 
larger social unit need not reflect the species’ perception
of what constitutes habitat or high-quality habitat
(Morrison et al. 1998). Much of what analysts
may perceive as a numerical response to extant envi- 
ronmental conditions actually may be responses to the 
combination of both relic and current habitat and demographic
conditions. Habitat models implicitly assume
that deviations from the theoretical null patterns
of distribution are immediate responses of individuals 
to resources that are available in the measured space. 
However, demographic organization can represent a 
long-term response to energy and material resources 
in the environment experienced by the species, and it 
can pose much of the information useful to the indi- 
viduals. If a suite of environmental and demographic 
conditions typified the success of the species over long 
time periods, then the species likely developed percep- 
tions of habitat and demographic organization that 
are relatively reliable evolutionarily. Such stability 
could be achieved by building the responsible neurons 
early in ontogeny (Coss and Goldthwaite 1995). How 
such perceptions would be manifested in the changed 
landscapes of modern times can affect our interpreta- 
tions of habitat. 

Habitat analysis is bound to include plenty of noise 
caused by conserved use of information from relic en- 
vironments, which today may be missing from the 
studied environment. For example, after Hutto (1990) 
experimentally reduced food supplies from under the 
bark of specific trees, he found no difference in visits 
per tree or time spent at the tree by birds that depend 
on this food supply during the winter months in the 
boreal forest. In another example, the carrying capac- 
ity of pocket gophers in alfalfa fields is determined not 
by the food source, which is plentiful, but by the home 
range size. The home range size was established by 
natural selection in past environments where food 
supplies were typically more limited. Pocket gophers 
in alfalfa are so constrained by evolutionarily designed 
perceptions of space that they cannot fully exploit al- 
falfa stands, which are perhaps the most abundant 
food source these animals have ever encountered. 
Management or policy decisions should acknowledge 
a reasonable level of uncertainty in habitat analysis or 
assessment and should be very conservative. 


Conclusions 


Comparing numerical terms for use and availability
cannot reliably characterize habitat without consider- 
ing the environmental and demographic contexts of 
the numbers. The measurements and resulting associations
are determined largely by the attributes of the
field study design, as well as those of the analytical, in- 
terpretive design. These attributes are themselves de- 
termined largely by the analyst's preconception of 
habitat, demographic organization of the species, how 
the species is patterned spatially and temporally, and 
the reasons for animals to congregate or associate 
with environmental elements out of proportion to the 
element's availability (Fig. 6.2). At our current state of 
knowledge, world-views may bear more heavily on 
habitat analysis than does the procedure for measur- 
ing occurrence frequencies in environmental elements. 
Some of these world-views can be tested as hypotheses 
by directed field research. For example, how often do 
animals aggregate for reasons that do not bear directly 
on the availability of a food resource? Are all aggrega- 
tions truly populations? How sharply bounded are an- 
imal populations? Answers to these and other related 
questions can provide habitat analysts with meaning 
for numerical comparisons that the models do not 
provide. 

The effectiveness of habitat analysis based on nu- 
merical comparisons is largely dependent on knowing 
how and why animals distribute themselves. The mod- 
els in Table 6.1 will be prone to inappropriate applica- 
tion and erroneous interpretation until analysts largely 
agree on the extent to which animal species respond 
numerically to energy availability, information presented
by the past and current environments, or to
predators and competitors. Each model has a theoreti- 
cal root or history, and we need to decide whether the 
theoretical foundation is consistent with what we are 
attempting to measure and how we should interpret 
the resulting patterns. 

Van Horne's (1983) argument that density is a 
misleading indicator of habitat quality remains valid, 
and based on recent research on density, it is all the 
more clear that numerical comparisons are currently 
of dubious utility to habitat analysis. Density estimates
are sensitive to whether they are derived from
population “isolates” or from sampling of the statistical
universe (Preston 1962a), and they are sensitive
to the size of the study area examined (Smallwood 
and Schonewald 1996). Until much more basic re- 
search on animal distribution has been conducted, 
the models in Table 6.1 should be used cautiously because
their use can translate into inappropriate management
decisions, sometimes with possible dire consequences
for the species.


The Role of Category Definition in
Habitat Models: Practical and Logical 
Limitations of Using Boolean, Indexed, 
Probabilistic, and Fuzzy Categories 


Kristina E. Hill and Michael W. Binford 


The use of categories is a fundamental part of
human reasoning (Rosch 1975; Lakoff 1987).
Conceptual categories are essential for the articulation 
of biological theories. The biological species itself is a 
category with an interesting history of conflicting defi- 
nitions, as are the notion of habitat and theories of the 
niche (Mayr 1982). Set theory is the branch of mathe- 
matics that deals primarily with numerical descrip- 
tions of membership in categories, which allows us to 
represent categories in logical statements (Ayyub and 
McCuen 1987; Burrough 1989, 1992; Ross 1995). 
Our argument is that categories are an unavoidable 
component of model construction and use and are ei- 
ther implicitly or explicitly represented in a model. 
Even models that are composed only of mathematical 
statements rely on underlying conceptual categories, 
such as “habitat” or “suitability.” Managers and plan- 
ners, in turn, often use models as the basis for defining 
(and defending) categories like “suitable habitat” 
(Steinitz 1969). Models are frequently used in applied 
settings to determine whether a certain set of environ- 
mental conditions is “better” or “worse” for a threat- 
ened species, terms which themselves represent cate- 
gories. Models are also sometimes used to make 
decisions about whether certain geographic areas will 
be altered by human land use, perhaps permanently. 
Our concern is that habitat models are sometimes
used in ways that are inappropriate, given the logical
assumptions that may be inherent in such models.
These logical assumptions derive in part from the 
types of categories that are implicitly or explicitly used 
in model construction. We are also concerned that the 
ad hoc use of categories in habitat model construction 
limits the contribution that these models can make to 
theory. 


A Taxonomy of Category Types 
for Habitat Models 


Quantitative modeling must be informed by theories 
of measurement and systems of logic if it is to be use- 
ful in building bodies of theory (Stevens 1946) and if 
it is to be used as defensible support for strategic deci- 
sions (Steinitz 1979; Lemons 1996). This imperative 
emerges from guidelines for theoretical rigor as well as 
from standards of common sense. In 1946, facing a 
lack of guidance for researchers who wished to meas- 
ure human responses to environmental stimuli, 
Stevens proposed a classification of measurement 
scales (nominal, ordinal, ratio, and interval). He in- 
cluded a table of appropriate mathematical transfor- 
mations that could reasonably be applied to each type 
of measurement. This classification was a significant 
aid to consistency in subsequent modeling activities. 
We argue that a similar kind of taxonomy is needed 
today with regard to categories, particularly in the use 
of geographical data for habitat models (Robinson
and Frank 1985; Rotenberry 1986; Goodchild and
Gopal 1989; Stoms et al. 1992; Conroy and Noon
1996; Hunsaker 1996; Flather et al. 1997; Clark and
Shutler 1999).

TABLE 7.1.

Summary of differences between discretely defined categories and ambiguous categories.

                          Discretely defined categories        Ambiguous categories

Logical basis             Classical set theory                 Fuzzy set theory
Membership concept        Discrete, with absolute thresholds   Uses a gradient of similarity to a
                                                               clearly defined best example
Membership function       Boolean (yes or no; 0 or 1)          Graded (0 to 1 scale)
Means of representing
  uncertainty             Probability                          Possibility
Source of uncertainty     Stochasticity                        Ambiguity
Operator for finding
  intersection            Multiplication                       Minimum function

In contrast to Stevens, our focus is on the types of 
categories that can be used to construct models rather 
than on the scales of measurement that may be used to 
represent phenomena. We are concerned that re- 
searchers and managers do not currently share a com- 
mon framework for identifying the kinds of uncer- 
tainty that can be represented by different category 
types. More broadly, we believe there is a need for an 
approach to modeling that identifies the kinds of un- 
certainty that can be accounted for when an investiga- 
tor constructs a habitat model and when a manager 
subsequently applies that model. 

Our proposed approach is based on the simple ob- 
servation that there are two fundamentally different 
types of categories: (1) discretely defined (or “discrete”)
categories, and (2) ambiguous categories. Discrete
categories are based in the logic of classical set
theory, while ambiguous categories have their basis in 
fuzzy set theory. Table 7.1 provides a summary of the 
differences represented by these two category types. 

Discretely defined categories can be defined as cate- 
gories that use discrete thresholds to define limits for a 
variable, regardless of whether the variable itself is nu- 
merically continuous or discrete. Membership in such 
a category is absolute. The logic of membership in dis- 
crete categories is based on Aristotle's Law of the Ex- 
cluded Middle, which states that either an element belongs
to a category or it does not. The concept of dis-
crete categories was fundamental to Aristotle's ap- 
proach to categorization and to the development of 
logical constructs that subsequently affected mathe- 
matical philosophy and the development of classical 
set theory (Apostle 1980). Moreover, the assumptions 
of probability theory require that one must be able to 
define a discrete state or event in order to assign a 
probability to its occurrence (Chernoff and Moses 
198599 Ross 1995). 

Ambiguous categories, also referred to as vague or 
fuzzy categories, are categories that have been defined 
using the concept of similarity to an ideal state or 
event. Membership in these categories is frequently 
defined using a gradient rather than a discrete thresh- 
old. If at least one limit for a category is defined using 
a gradient, then this category must be considered 
ambiguous—even if a discrete threshold is used for the 
definition of membership at the other limit. The the- 
ory that underlies ambiguous categories has been de- 
scribed as a generalization of classical set theory that 
relaxes the Law of the Excluded Middle (Ross 1995).
This is now widely referred to as fuzzy set theory. 

As a generalization of classical set theory, fuzzy set 
theory allows membership in a category to be de- 
scribed using either a gradient or a discrete threshold, 
or some combination of both. It is important to note 
that both discrete and fuzzy categories allow for the 
representation and consideration of a particular kind 
of uncertainty. When used in combination with proba- 
bilistic reasoning, discrete categories allow a model to
represent uncertainties that result from randomness.
Fuzzy categories, on the other hand, represent uncer- 
tainties that are non-random. Gradients of member- 
ship in these categories are based on similarity, not 
stochasticity. The different degrees of membership that 
may be assigned to an element represent the degree of 
similarity between that element and some ideal state 
or event. It is impossible to refer to the probability of 
this similarity, since probability theory requires the 
identification of a discrete state or event. Instead, 
fuzzy set theory provides the term “possibility” to 
refer to the likelihood of similarity (Dubois and Prade 
1980). 
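
As a minimal sketch of the distinction summarized in Table 7.1, the two kinds of membership function can be written as follows. The function names, the threshold of 5.0, and the limits 0.0 and 10.0 are hypothetical values chosen only for illustration; they are not taken from any published habitat model.

```python
def boolean_membership(x, threshold=5.0):
    """Discrete category: membership is absolute (0 or 1)."""
    # Assumption: a value exactly at the threshold counts as a member.
    return 1.0 if x >= threshold else 0.0


def graded_membership(x, lower=0.0, upper=10.0):
    """Ambiguous category: membership rises linearly from 0 at `lower`
    to 1 at `upper`, expressing similarity to an ideal state."""
    if x <= lower:
        return 0.0
    if x >= upper:
        return 1.0
    return (x - lower) / (upper - lower)


for x in (2.0, 4.9, 5.0, 7.5):
    print(x, boolean_membership(x), graded_membership(x))
```

The Boolean function jumps from 0 to 1 at the threshold, whereas the graded function assigns intermediate degrees of membership along the whole gradient.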


Model Objectives, Underlying Theory, and 
the Logic of Category Definition 


When building a model, an investigator must judge 
whether the concepts that underlie the model are in- 
herently discrete, inherently ambiguous, or both. This 
determination provides the model with an explicit 
foundation in current theoretical understandings of 
both logic and biological phenomena such as the spa- 
tial distribution of a biological species or the responses 
of individual organisms to environmental stimuli. 
Without this grounding in theory, models run the risk 
of misinterpretation and a lack of defensibility in an 
applied context (Greenhouse 1997). The development 
and testing of models will also be unlikely to con- 
tribute to future theoretical understanding if the mod- 
els do not explicitly incorporate testable hypotheses 
that originate in a body of theory. 

Habitat models frequently address (1) the likeli- 
hood of the occurrence of a species at a given location, 
(2) the likely abundance of a species at a given loca- 
tion, or (3) the potential of a given location to serve as 
habitat for a particular species. Each of these three 
modeling objectives implies a different set of logical 
and theoretical assumptions, and each allows model 
users to make different interpretations of model out- 
put in an applied context. The fundamental difference 
we see among these models derives from the logical 
basis each uses for making predictions. A probabilistic 
model uses a pattern of discretely defined states or 
events to predict the likelihood of a discretely defined 
state or event. A fuzzy model, on the other hand, uses
the concept of similarity to predict the possibility of a
state or event that is ambiguously defined. The prob- 
lem for the model builder, and for the user, is to 
choose the type of model that is appropriate to a given 
question. 

Probability theory may be used as the basis for pre- 
diction if the occurrence of an individual organism at 
a specific sampling location is defined as a discrete 
event and the likelihood of occurrence is determined 
using a record of observations at that location over 
time. Predictions of occurrence can then be made 
using methods derived from the limiting frequency in- 
terpretation of probability theory. This interpretation 
holds that over a large number of instances, a discrete 
event can be predicted to occur with a known fre- 
quency (Hacking 1975). Limiting frequency ap- 
proaches to probability are the basis for most inferen- 
tial statistics and are familiar to most biological 
researchers. However, this approach is not logical 
when a model is constructed using anything other 
than discretely defined states or events. The use of 
probability theory is not appropriate for models that 
describe habitat suitability using the concept of simi- 
larity to some expected optimal conditions. 
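
The limiting-frequency reasoning described above can be illustrated with a short sketch. It assumes that each visit to a sampling location yields a discrete outcome (species detected or not detected); the function name and the visit record are hypothetical.

```python
def occurrence_frequency(detections):
    """Relative frequency of a discrete event (species detected: True or
    False) over repeated visits to one sampling location."""
    if not detections:
        raise ValueError("at least one visit is required")
    return sum(1 for d in detections if d) / len(detections)


# Hypothetical record of ten visits to a single location.
visits = [True, False, True, True, False, False, True, False, True, True]
print(occurrence_frequency(visits))  # 0.6
```

The estimate is meaningful only because each visit has an unambiguous outcome; no comparable count exists for a degree of similarity to optimal conditions.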

The concept of suitability may contain both dis- 
crete and ambiguous dimensions. Some defining crite- 
ria can be discrete, such as whether or not a particular 
tree is wide enough to accommodate the body size of a 
pair of cavity-nesting birds. There is likely to be a dis- 
crete lower limit along this dimension of suitability. 
But the upper limit of this category can be ambiguous. 
There may be no trunk diameter that is too wide for a 
nesting pair of these same birds. Other criteria that de- 
fine suitability may contain ambiguity in both their 
upper and lower limits. Canopy density, for example, 
may be used in some habitat models as a proxy meas- 
urement of food availability for an insectivorous bird 
species. There may be no discrete upper or lower lim- 
its for this dimension of suitability for the species of 
interest, except when there is a total absence of tree 
canopy. 

If a habitat model describes the potential suitability 
of a given location to serve as habitat for a particular 
species, then its logic must be able to account for 
categories that allow a gradient of membership. The
desire to define “suitable sites,” whether for nesting,
foraging, or other activities, requires an investigator to
answer the question of whether upper and lower limits 
of environmental variation that affects the responses 
of individual birds can be discretely defined. Answer- 
ing this question requires a theoretical framework 
that offers explanations of animal behavior and 
perception (Fig. 7.1). 

Theories of the niche, for example, provide a theo- 
retical concept that can inform the definition of habi- 
tat suitability. Using these theories, suitability would 
be defined using the concept of similarity to an opti- 
mum condition. The appropriate logic for represent- 
ing similarity to an ideal state or event is fuzzy set the- 
ory. Habitat models that address suitability using 
theories of the niche necessarily rely on an underlying 
logic of ambiguous categories. The only way to avoid 
this is for an investigator to employ an underlying the- 
oretical concept that does not rely on the idea that an 
organism has different levels of response to environ- 
mental gradients and that an optimal level exists for 
that organism. 

The niche concept is described using similarity to 
an optimal state. Most habitat suitability models rely 
on this logic as well, since they are explicitly or im- 
plicitly based on theories of the niche. The familiar 
concept of a “suitability index” represents, in logical 
terms, a fuzzy set (Hill 1997)—as does the biological 
concept of the niche (mathematical representations of 
the niche as a fuzzy set are presented by Cao 1995 and 
by Bock and Salski 1998). An investigator who uses 
similarity-based concepts to construct a habitat model 
should approach model construction and testing using 
fuzzy logic as the basis for reasoning with and about 
such a model. Otherwise, the interpretation of the 
model’s results will be inadequately grounded in both 
theory and logic. 

Of course, it is also possible to define “suitability” 
and “potential habitat” using the limiting frequency 
approach of counting occurrences or recording abun- 
dance over time at a given location. But unlike simi- 
larity-based reasoning, this method alone does not 
provide a logical or theoretical basis for making pre- 
dictions in new locations where sampling has not been 
conducted over time, unless the expected variation in 
habitat selection by an organism is assumed to occur 
only as a result of stochastic processes. Predictions
that are based on non-random variations in an organism's
process of habitat selection require an underlying
theoretical concept that describes that organism's
responses to its environment. Theories of stochasticity
alone do not provide such a description.

Figure 7.1. Use of two different functions to define habitat
suitability along a range of values for the environmental
variable X: (a) a Boolean function with a discrete threshold, and (b)
a continuous linear function.

In common practice, correlational studies that ob- 
serve the presence or abundance of a bird species and 
one or more environmental conditions over a set of 
sampling locations are used as a basis for habitat suit- 
ability models (USFWS 1981a,b). Probability theory is 
typically used to determine whether the variation in 
abundance is greater than would be expected by 
chance. In our view, there is no conflict between the 
use of these methods and the use of a similarity-based 
logic in constructing model categories. The results of 
correlational studies can be the basis for the construc- 
tion of ambiguous categories as well as discrete cate- 
gories. The difference in practice that we do recom- 
mend, however, affects both (1) the choice of
mathematical operations used to combine variables
within a model and (2) the testing, use, and interpretation
of the model in an applied context. We discuss
each of these issues below and provide an example
from a case study by Hill (1997).

Figure 7.2. The operations used to find the intersection of
sets in probability theory (a) and in fuzzy set theory (b). In
(a), intersection is determined using multiplication
(A ∩ B = A × B); in (b), intersection is determined using
the minimum function (A ∩ B = min(A, B)). Similar
operations are used in models based on Boolean logic and in
models constructed using indices, but the choice of an
operation is frequently treated as if it were an ad hoc decision.


Choosing Mathematical Operators for 
Combining Categories in a Model 


Stevens pointed out that each scale of measurement 
should be subjected only to those mathematical opera- 
tions that are appropriate to that type of scale (Stevens 
1946; Hopkins 1977). Similarly, the underlying logic 
of the two category types we identify (discrete and am- 
biguous) restricts the range of mathematical operations 
that can be applied in a model (Hill 1997) (Fig. 7.2). 

Our argument requires us to first characterize the 
logical operations used to combine habitat variables in 
a model. If we wish to know where individuals of a 
species will be able to find both food and a nest site in 
the same place, then the appropriate logical operator 
is “and.” We are looking for places where the state- 
ment that “Condition A (food availability) exists and 
Condition B (nest site availability) exists” is true. The
corresponding term for this logical operator in mathe- 
matical set theory is “intersection” (Tomlin 1990; 
Burrough and McDonnell 1998). 

If the categories defined in a model are all discrete, 
then the intersection of two or more categories may be 
found using the “and” statement provided by Boolean
logic, in which only two states can exist (true and 
false, or yes and no, or 0 and 1). The mathematical 
operator required to produce an answer of “true” 
(i.e., represented by the number 1) when both Condi- 
tion A and Condition B are also “true” (i.e., repre- 
sented by the number 1) is multiplication. We could 
use a similar logic to treat Condition A or Condition B 
as probabilities. Probability theory finds the intersec- 
tion of two probabilities (stated as the likelihood that 
Condition A and Condition B are true) using multipli- 
cation. Thus the intersection of discrete categories, 
whether these are treated as a Boolean “truth” or as a 
probabilistic likelihood, is found using multiplication. 

However, if any of the categories defined in a model 
use gradients to define membership, then Boolean 
logic and probability theory are not useful. The intersection
of two categories where at least one category
limit is ambiguous may be found using operations de- 
fined by fuzzy logic (Zadeh 1965). The mathematical 
operator that corresponds to a fuzzy “and” statement 
is the minimum function (or a weighted minimum 
function, if the investigator believes there is reason to 
weight the variables that were used to define the cate- 
gories) (Dubois and Prade 1980; Ross 1995). This 
function is, in a sense, more conservative than the 
multiplication operator, because it returns the lesser of 
the two values it combines. Fuzzy logic uses this more 
conservative function because it recognizes that the 
gradations in membership used in two different cate- 
gories represent “apples and oranges”; in other
words, the two categories refer to different things, per- 
haps using different scales, and therefore cannot be 
manipulated via multiplication as if they were ratio- 
or interval-scale measurements. 
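
The operators discussed in this section can be compared in a minimal sketch. The function names and the example values are hypothetical, and the probabilistic version carries an assumption that the text leaves implicit: multiplying two probabilities gives their intersection only when the two events are independent.

```python
def boolean_and(a, b):
    """Intersection of two discrete (0 or 1) categories by multiplication."""
    if a not in (0, 1) or b not in (0, 1):
        raise ValueError("Boolean categories take only the values 0 or 1")
    return a * b


def probability_and(p_a, p_b):
    """Joint probability of two events by multiplication.
    Assumption: the two events are independent."""
    return p_a * p_b


def fuzzy_and(mu_a, mu_b):
    """Intersection of two fuzzy categories by the minimum function."""
    return min(mu_a, mu_b)


# Hypothetical ratings for food availability and nest site availability.
food, nest = 0.8, 0.4
print(boolean_and(1, 0))                      # 0
print(round(probability_and(food, nest), 2))  # 0.32
print(fuzzy_and(food, nest))                  # 0.4
```

With inputs restricted to 0 and 1 the product and the minimum give identical results; they diverge only for graded values strictly between 0 and 1, where the product is always the smaller of the two.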


An Example: Building a Habitat Model for the 
Black-capped Chickadee 


The black-capped chickadee (Poecile atricapilla) is a 
common species in several regions of the United 
States. A habitat suitability index (HSI) model was de- 
veloped and tested for this species by the U.S. Fish and 
Wildlife Service (USFWS) using species abundance and 
vegetation data from floodplain forests of the South 
Platte River in Colorado (Fig. 7.3) (Schroeder 1983a,
1990). 

Hill (1997) was interested in the effect that cate- 
gory definition might have on the model's ability to 
predict observed abundance values and its usefulness 
for defining a geographic area that would contain all 
of the sampling locations where chickadees had been 
recorded. These two model performance issues are 
particularly important to planners and managers since 
they must frequently use habitat models to establish 
the geographic boundaries of management zones. 
Planners and managers often have insufficient re- 
sources to collect field data to develop and calibrate 
habitat models specific to their own management 
areas, particularly when they must respond rapidly to 
emerging land-use conflicts. This need accounts for 
the use of the USFWS HSI model in this case study as 
if it were a generic suitability model that could be ap- 
plied anywhere.

Figure 7.3. U.S. Fish and Wildlife Service habitat suitability
index model for the black-capped chickadee (Poecile atricapilla)
(Schroeder 1983a, 1990). The final habitat suitability
value is obtained by taking the minimum value of these two
variables. [Panels: food availability, plotted as suitability
versus tree canopy volume per area of ground surface (m3/m2);
nest availability, plotted as suitability versus number of snags
between 10 and 25 cm dbh per hectare.]

Hill (1997) created several versions of the USFWS 
black-capped chickadee HSI model, varying only the 
number and type of categories used in the model (Figs. 
7.4, 7.5, and 7.6). Bird abundance data were obtained 
for model validation from an area of approximately 
375 hectares at the Harvard Forest Long-Term Eco- 
logical Research site at Petersham, Mass., by R. Lent. 
Sixty-seven randomly established sample sites were 
used to census the area during the months of May and 
June 1993 (R. Lent unpublished). Figure 7.4 shows 
the distribution of chickadee abundance in relation to 
canopy volume. Canopy volume was estimated from a 
measure of basal area using a relationship presented 
by the author of the USFWS HSI model (Schroeder 
1990). The second variable, number of dead snags per 
hectare, was not influential in this example since the 
number of snags in the sampling locations consistently 
exceeded the threshold value defined in the HSI model 
as suitable habitat (ten snags per hectare). 

The first alternative version of the HSI model pre- 
pared by Hill (1997) was Boolean and contained only 
two categories, “suitable habitat” and “unsuitable
habitat” (Fig. 7.5). Bayer and Porter (1988) demon-
strated better predictive performance when they sim- 
plified HSI models using Boolean categories created by 
setting the 0.5 suitability index level as a threshold 
value. Hill (1997) used the same threshold value to 
discretize the continuous habitat suitability output of 
the chickadee model. This discretized version can be 
thought of as a probabilistic model of suitability. But 
the biological basis for this construction of the model 


Abundance vs. Estimated Canopy Volume 


Relative Abundance, BCCH 


Estimated Canopy Volume 


Figure 7.4. Relative abundance of black-capped chickadees 
(Poecile atricapilla) versus estimated canopy volume in the Har- 
vard Forest study area. 


is not clear. It represents the hypothesis that the black- 
capped chickadee uses only forested areas with esti- 
mated canopy volumes of greater than 7.5 cubic me- 
ters per square meter. We are not aware of any 
theoretical basis for that discrete hypothesis. If indeed 
there is none, then discretizing the continuous output 
of a habitat suitability model is simply an ad hoc at- 
tempt to improve model performance. It does not rep- 
resent a hypothesis based in current theory and is un- 
likely to contribute to a theoretical understanding of 
species occurrences. 
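
The discretization step described above can be sketched as follows. The component curves of the published HSI model are not reproduced here, so the two component indices are simply taken as inputs on a 0 to 1 scale; the function names and example values are hypothetical, and the treatment of a value exactly equal to the threshold is an assumption.

```python
def habitat_suitability(food_index, nest_index):
    """Overall index as the minimum of two component suitability indices."""
    return min(food_index, nest_index)


def discretize(suitability, threshold=0.5):
    """Collapse a continuous index into two categories:
    1 (suitable habitat) or 0 (unsuitable habitat)."""
    # Assumption: a value exactly at the threshold is treated as suitable.
    return 1 if suitability >= threshold else 0


hsi = habitat_suitability(food_index=0.4, nest_index=1.0)
print(hsi, discretize(hsi))  # 0.4 0
```

Everything the continuous index says about sites rated 0.4 versus 0.1, or 0.6 versus 1.0, is discarded by the second function, which is why the threshold needs a biological justification of its own.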

The USFWS HSI model can also be constructed 
using ambiguous categories. Hill (1997) assembled 
these using an approach common in fuzzy systems en- 
gineering (Kosko 1992; Ross 1995). Three symmetri- 
cal and overlapping fuzzy categories were used to de- 
scribe habitat suitability along a gradient of canopy 
volume: high suitability, moderate suitability, and low 
suitability. Fuzzy set theory allows sets to overlap 
where elements belong to more than one set. In practi- 
cal terms, this means that a canopy volume estimate of 
7.5 cubic meters per square meter would belong to the 
set of “high suitability” sites to a degree of 0.5 and to
the set of “moderate suitability” sites to a degree of
0.5. A membership value of 1.0 indicates an area of
the graph that has no overlap between categories (Fig.
7.6).

Figure 7.5. A two-category Boolean version of the habitat
suitability index model for the black-capped chickadee (Poecile
atricapilla). A threshold of 0.5 was used to discretize the original,
continuous function published by Schroeder (1990).
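
A minimal sketch of three symmetrical, overlapping triangular categories is given below. It assumes a canopy volume range of 0 to 10 cubic meters per square meter with category peaks at 0, 5, and 10, which is consistent with the 7.5 example in the text but is not stated there in full; the function names are hypothetical.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 at `left` and `right`, 1 at `peak`."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def suitability_memberships(canopy_volume):
    """Degrees of membership in three overlapping suitability categories.
    Assumption: canopy volume is expressed on a 0 to 10 m3/m2 range."""
    if not 0.0 <= canopy_volume <= 10.0:
        raise ValueError("canopy volume must lie between 0 and 10")
    return {
        "low": triangular(canopy_volume, -5.0, 0.0, 5.0),
        "moderate": triangular(canopy_volume, 0.0, 5.0, 10.0),
        "high": triangular(canopy_volume, 5.0, 10.0, 15.0),
    }


print(suitability_memberships(7.5))
# {'low': 0.0, 'moderate': 0.5, 'high': 0.5}
```

At the category peaks (0, 5, and 10) exactly one category has a membership of 1.0 and the other two have 0.0, which corresponds to the points on the graph where the categories do not overlap.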

The use of three overlapping categories instead of 
a single fuzzy set (i.e., an index) has become well- 
established in the design of engineering control systems, 
where the three overlapping categories might program a 
microprocessor to adjust the response of a motor to a 
measured environmental condition such as ambient air 
temperature (Kosko 1992). The overlap among cate- 
gories helps to prevent the phenomenon of “overshoot”
in mechanical systems that respond to environmental 
inputs. 

Hill (1997) suggested that using overlapping cate- 
gories can help biologists and land managers represent 
and perhaps test the concept of a realized niche 
(Hutchinson 1957), while most single-category suit- 
ability indices attempt to represent only the funda- 
mental niche. Her argument was that phenomena such 
as interspecific competition or the temporal order of 
patch colonization by different species may be indi- 
rectly visible in the relationship between the observed 
abundance or presence of a species and the suitability 
ratings provided by each of three overlapping fuzzy 
categories. For example, if the category “moderately
suitable” best predicts the abundance or occurrence of
the species, that might be interpreted as evidence that 
the species is not occupying its optimal habitat rather 
than as evidence that the suitability model itself is 
wrong. Bird abundance data from the Harvard Forest,
for instance, showed that red-breasted nuthatches
(Sitta canadensis) were most abundant at sample sites 
with the greatest canopy volume. The black-capped 
chickadee might be expected to occupy that same op- 
timal habitat but instead was most abundant at sites 
with moderate canopy volume. 

Abundance is not an ideal indicator of habitat suit- 
ability (Van Horne 1983), and the suboptimal distri- 
bution of chickadees found in this case may have oc- 
curred for various historical or biological reasons. But 
this suboptimal distribution cannot even be perceived 
if a planner or manager simply applies a single- 
category model that rates suitability using expecta- 
tions of optimal conditions. The typical suitability 
index model may simply appear to fail in its predic- 
tions without allowing a researcher or manager to 
form new hypotheses about the biological environ- 
ment in which the species of interest has become es- 
tablished. On the other hand, application of a three- 
category model in which the category “moderate 
suitability” is the best predictor of abundance raises
the question of why this category predicted best. 


Testing the Accuracy of Habitat Models 


Tests used to establish the predictive ability of habitat 
models require that the data used in these models
meet appropriate statistical assumptions. Most
contemporary tests of scientific models have been applied
with the assumption that the data should meet
the requirements of inferential statistics and probability
theory. For instance, a correlation coefficient has
often been used to test predictions of suitability
against actual abundance. The suitability ratings produced
by the original USFWS HSI model for the
black-capped chickadee (Schroeder 1990) did not
show a significant correlation with that species’ abundance
in the Harvard Forest case study described
above (Pearson's r = -0.18, P > 0.10). Similarly, a
Boolean version of this model did not perform well in
a t-test that compared the mean suitability rating of
sites with chickadees to the mean suitability rating
of sites without chickadees (P > 0.10).

Figure 7.6. A three-category fuzzy set version of the habitat
suitability index model for the black-capped chickadee (Poecile
atricapilla). The triangular functions are used in a symmetrical
pattern to represent an even distribution of the categories
along the variable’s range, since no more specific hypothesis
has been stated. [Axes: suitability versus estimated canopy
volume per ground surface area (m3/m2), with tick marks at 0, 5, and 10.]

Perhaps our first question should be whether the 
concept of suitability meets the basic requirement that 
we must begin by defining a discrete state or event in 
order to use probability theory. We argue that it does 
not, except in the unusual case where the biological 
theories that underlie the suitability model identify 
only discrete thresholds of biological response. This 
requirement excludes habitat models that are based on 
theories of the niche, since this theory predicts gradi- 
ents of species response. If we wish to maintain con- 
sistency between theory and method in habitat suit- 
ability analysis, we must look for both models and 
tests that do not require the definition of discrete 
states or events. Those tests should be able to enhance 
the defensibility of habitat models in applied decision- 
making contexts as well as establish the conditional 
usefulness of a model for theory building. 

We find that the issue of defensibility is often not 
well understood. From an applied perspective, there 
are two ways a habitat suitability model can fail: (1) it 
can fail to predict the abundance of a species where it 
is indeed abundant, and (2) it can overpredict, rating 
many or all locations suitable although the species has 
not been detected there. The first case represents an 
error of omission in which the model used by a man- 
ager to predict habitat use omits areas where the or- 
ganism is abundant. The second case represents an 
error of commission. Both types of errors undermine 
the defensibility of a model in an applied setting. Yet, 
the presumption in natural resource management
must be that errors of omission are worse than errors
of commission if we wish to implement a precaution- 
ary principle for biological conservation (Shrader- 
Frechette and McCoy 1993). 

In order to avoid an inappropriate reliance on 
probability theory via the use of inferential statistics, 
Hill (1997) developed two simple tests for the sake of 
argument. These tests are pragmatic and are driven by 
(1) the planner or manager’s ethical interest in avoid- 
ing irreversible harm to a species of conservation con- 
cern, and (2) a need to enhance the defensibility of 
habitat models. Her first test was designed to identify 
and reject models that produce numerous errors of 
omission. It was intended for application with models 
that try to predict the abundance of a species using an 
estimate of habitat potential. The test plots habitat 
suitability (i.e., habitat potential) against relative 
abundance and treats the number of times when rela- 
tive abundance exceeds the predicted habitat potential 
as instances of model failure (Fig. 7.7). 
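
The first test can be sketched as follows, assuming that both predicted habitat potential and relative abundance are expressed on a common 0 to 1 scale and that distance above the 1:1 line is measured vertically. Neither detail is specified in the text, and the function name and sample values are hypothetical.

```python
def omission_test(predicted_potential, relative_abundance):
    """Summarize the sample points where observed relative abundance
    exceeds the predicted habitat potential (points above the 1:1 line)."""
    if len(predicted_potential) != len(relative_abundance):
        raise ValueError("inputs must have the same length")
    # Vertical distance above the 1:1 line, kept only where abundance is higher.
    excess = [a - p
              for p, a in zip(predicted_potential, relative_abundance)
              if a > p]
    return {
        "failures": len(excess),
        "max_excess": max(excess, default=0.0),
        "total_excess": sum(excess),
    }


# Hypothetical values for four sample sites.
predicted = [0.25, 0.75, 0.5, 0.0]
observed = [0.5, 0.5, 0.5, 0.125]
print(omission_test(predicted, observed))
# {'failures': 2, 'max_excess': 0.25, 'total_excess': 0.375}
```

Sites where abundance falls on or below the predicted potential are not counted against the model, since this test is concerned only with errors of omission.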

This test allows models to be accepted or rejected
according to the degree to which they underestimate
habitat potential but does not measure the tendency of
a model toward overestimation. A second test was
intended to account for this by measuring the degree of
overestimation, specifically, in models that predict
potential habitat suitability. It simply compares the
degree of skewness in a frequency distribution of habitat
suitability ratings at sample points where the species is
present to the skewness of a distribution of suitability
ratings at sample points where no individuals of that
species were detected (Fig. 7.8). Skewness should be
greater where no individuals of that species were
observed (and ideally should be highly positive, although
even the best model may not produce this result). In
the case of a model that has more than two categories,
the tests described above would be applied to each
category individually.

Figure 7.7. A simple test for habitat models intended to
assess potential habitat rather than to forecast the actual
abundance or occurrence of a species. This test counts only
instances of abundance exceeding the expected habitat potential
as failures of the model. Both maximum distance of points
above the 1:1 line and total distance of points above the 1:1
line may be useful measures to compare the performance of
different models.

Figure 7.8. A simple test to assess the degree to which a
model of potential habitat overestimates the potential for a
species to occupy a geographical area. The measured skewness
of a frequency histogram should be a higher number in a
histogram of suitability values for sites with no chickadees
present than in a histogram of sites with one or more
chickadees present. Ideally, the skewness should be positive where
no individuals of the species are observed and negative where
individuals are observed.
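
The second test can be sketched in a few lines. The example below assumes the moment-based skewness coefficient (third central moment divided by the 1.5 power of the second); the source does not say which skewness estimator was used, and the suitability ratings are invented for illustration.

```python
def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5 (an assumed estimator)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


def overestimation_test(suit_absent, suit_present):
    """Return both skewness values and whether absent sites are more right-skewed."""
    s_absent = skewness(suit_absent)
    s_present = skewness(suit_present)
    return s_absent, s_present, s_absent > s_present


# Hypothetical suitability ratings on a 0-1 scale.
absent = [0.1, 0.1, 0.2, 0.2, 0.3, 0.9]   # mostly low ratings, long right tail
present = [0.1, 0.7, 0.8, 0.8, 0.9, 0.9]  # mostly high ratings, long left tail
s_abs, s_pres, passed = overestimation_test(absent, present)
print(round(s_abs, 3), round(s_pres, 3), passed)
```

In this made-up case the skewness is positive where the species is absent and negative where it is present, which is the ideal outcome described in the text.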


Implications for the Use of Habitat 
Models in Planning and Management 


When habitat models are applied to land management 
and land-use decisions, we must be clear about pre- 
cisely what is being modeled. Models that are based 
on limiting-frequency definitions of the probability of 
occurrence for a species should not be used as if they 
were models of potential habitat, and models based on 
similarity should not be misinterpreted as predicting 
the probability of a species’ occurrence. Each type of 
model relies on a different underlying logic, which we
argue should be made explicit in each modeling effort.
We have recommended specifically that a taxonomy of 
category types should be used to determine when it is 
appropriate to use probability theory or fuzzy set the- 
ory in constructing habitat models. This also allows a 
model to explicitly represent two different kinds of 
uncertainty: (1) uncertainty that results from random 
variation, and (2) uncertainty that results from gradi- 
ents of species response to environmental conditions. 
Discrete categories can be used along with probability 
theory to represent the first kind of uncertainty, while 
fuzzy categories represent the second. 

Serious errors can be introduced by the application 
of habitat models when researchers or planners and 
managers ignore or misinterpret the fundamental logic 
of the model used. The most significant sources of 
error are (1) the use of discrete thresholds for conven- 
ience when these are not supported by either theory or 
evidence, and (2) the use of an inappropriate mathe- 
matical operation to produce output values from a 
habitat model. These errors are particularly important 
in determining the appropriate geographic area 
needed to conserve a given species’ habitat. 

In situations where discrete thresholds are used in- 
appropriately, geographic boundaries are introduced 
in the mapping of management zones that are actually 
just data artifacts, with no clear connection to a theo- 
retical framework. Frequently, these reified boundaries 
are misinterpreted as having genuine biological 
meaning—particularly when many map layers have 
been combined in an analysis and the source layers 
where these data artifacts originated are no longer in- 
dividually visible. The spatial consequences can be sig- 
nificant in the final map of areas to be conserved if a 
significant proportion of the area has values close to 
the discrete threshold value used in one or more cate- 
gory definitions. These areas represent “gray zones” 
that almost but don’t quite meet the model conditions 
(or conversely, there may be extensive zones that were 
just barely included by the model). A legal suit could 
be mounted, either by conservation or development 
interest groups, contesting the designation of this in- 
termediate zone. Such a suit could significantly impact 
conservation efforts. Sensitivity analyses should be 
conducted before deciding on the functions and
thresholds that will be used for category definitions to
reveal the risk of this type of error. 

When inappropriate mathematical operations are 
used to find the intersection of two or more categories 
in a model, the spatial errors created when these models
are used to plan habitat reserves can be even more
difficult to detect. A final map of habitat suitability 
generated by multiplying two suitability indices (an 
operation appropriate only for probability estimates 
or Boolean categories) will differ from a final map 
generated by taking the minimum value (as is appropriate
for fuzzy sets). In general, the difference between
taking the minimum value and using multiplication
is greatest when all constituent variables have values in
the midrange of rated suitability. In other words, if
one suitability variable is rated with a suitability of 
0.5, and a second is also rated 0.5, the combined value 
under multiplication (0.25) is much lower than the 
value that would be obtained using the minimum 
function (0.5). Depending on the threshold value used 
to draw a boundary around suitable habitat areas, a 
planner or manager could be confronted with maps 
that show two quite different geographic habitat areas 
that have been designated using models that are other- 
wise identical. 
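
The arithmetic behind this comparison can be checked directly. The sketch below uses hypothetical suitability ratings and a hypothetical threshold of 0.4; only the two combination rules (product and minimum) come from the text.

```python
def combine_product(a, b):
    """Intersection as used for probabilities or Boolean categories."""
    return a * b


def combine_minimum(a, b):
    """Intersection as used for fuzzy sets."""
    return min(a, b)


# Gap between the two rules when both variables share the same rating.
ratings = [i / 10 for i in range(11)]
gaps = {x: combine_minimum(x, x) - combine_product(x, x) for x in ratings}
largest_gap_at = max(gaps, key=gaps.get)

# Hypothetical map cells, each rated on two suitability variables.
cells = [(0.9, 0.8), (0.5, 0.5), (0.6, 0.7), (0.2, 0.9)]
threshold = 0.4  # assumed cutoff for drawing the "suitable" boundary
suitable_by_minimum = [combine_minimum(a, b) >= threshold for a, b in cells]
suitable_by_product = [combine_product(a, b) >= threshold for a, b in cells]

print(largest_gap_at, gaps[largest_gap_at])
print(suitable_by_minimum)
print(suitable_by_product)
```

For equal ratings x the gap is x - x^2, which peaks at x = 0.5. The cell rated (0.5, 0.5) is therefore mapped as suitable under the minimum rule and unsuitable under multiplication, even though the underlying ratings are identical.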


Recommendations for the Development 
and Use of Habitat Models 


In summary, researchers and managers alike need 
habitat models that can explicitly represent their as- 
sumptions about uncertainty. This uncertainty may 
arise from ambiguities in the definition of suitability, 
or from stochastic processes that affect species distri- 
butions. Our conclusion is that models that seek to 
predict the potential abundance or presence of a 
species using the conceptual framework provided by 
theories of the niche should represent the inherent am- 
biguity of suitability using a similarity-based logic. 
Models that seek to predict the likelihood of species
occurrence or abundance using probability theory
must first define a discrete state or event. Historical 
occurrences or abundance levels may be that event. 
But it is not clear to us that there is a general theoreti- 
cal basis that provides a definition of discrete states 
for habitat suitability. 

The theory of fuzzy sets is being developed rapidly. 
This theoretical work is likely to be directly applicable 
to the work of ecologists and managers who seek to 
describe habitat suitability using models that are both 
conservative (in the sense that they help to avoid irre- 
versible loss of species) and defensible in an applied 
setting. We recommend that managers and planners 
who wish to use previously published habitat models 
should reconstruct these models using three overlap- 
ping fuzzy sets. This simply requires that three sym- 
metrical, triangular functions be superimposed on the 
range of variability in each constituent variable of the 
habitat model and that the minimum function be used 
to provide a final suitability rating. The model cate- 
gories can be tested for errors of omission and com- 
mission using simple tests like the ones we presented 
above. Model calibration may indicate that the moder- 
ate suitability category (or the low suitability category) 
is the better predictor according to these simple tests. 
In that case, we would suggest that managers might 
look for limitations on habitat use that could be caused 
by factors not represented in the suitability model, 
such as competition or predation. 
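
One possible reading of this recommendation is sketched below. It assumes that the three symmetrical triangles peak at the minimum, midpoint, and maximum of each variable's range, each with a half-width of half the range, and that the three sets correspond to low, moderate, and high values of the variable. The variable names, ranges, and site values are hypothetical.

```python
def triangular(x, peak, half_width):
    """Symmetrical triangular membership: 1 at the peak, 0 beyond peak +/- half_width."""
    return max(0.0, 1.0 - abs(x - peak) / half_width)


def memberships(x, lo, hi):
    """Membership of x in three overlapping sets spanning the range [lo, hi]."""
    half = (hi - lo) / 2.0
    return {
        "low": triangular(x, lo, half),
        "moderate": triangular(x, (lo + hi) / 2.0, half),
        "high": triangular(x, hi, half),
    }


def site_rating(values, ranges):
    """Combine the variables with the minimum function, one category at a time."""
    per_variable = [memberships(values[name], *ranges[name]) for name in values]
    return {
        cat: min(m[cat] for m in per_variable)
        for cat in ("low", "moderate", "high")
    }


# Hypothetical variables, ranges, and one site.
ranges = {"canopy_cover": (0.0, 100.0), "shrub_cover": (0.0, 60.0)}
site = {"canopy_cover": 75.0, "shrub_cover": 45.0}
rating = site_rating(site, ranges)
print(rating)
```

Because the sets overlap, a site can belong partly to two categories at once, which is how the fuzzy formulation avoids the sharp boundaries that discrete thresholds impose.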

Although this recommendation may seem simplistic 
to researchers, we are sympathetic to managers’ needs
to use approximate reasoning to come to defensible 
conclusions in situations of uncertain knowledge. We 
believe that the alternative approach we describe offers 
opportunities to reason approximately, develop new 
hypotheses, and make decisions that implement a pre- 
cautionary principle. On the research side, we believe 
that the issues we have raised regarding category defi- 
nition point to a need for closer integration of models 
with underlying biological theories, as attention to the 
defensibility of models in applied settings increases.


Use of Regional-scale Exploratory Studies
to Determine Bird-habitat Relationships 


Jock S. Young and Richard L. Hutto 


The wide geographic extent of regional bird monitoring
programs usually makes them nonexperimental
and exploratory in nature. In such studies,
variables are often chosen by expedience, and sources 
of variation are not controlled. Even so, there is great 
potential to learn something meaningful about bird- 
habitat relationships when bird distribution or abun- 
dance information is linked with additional informa- 
tion about vegetation characteristics at the sites 
(Wiens and Rotenberry 1981a; Ralph et al. 1995a). 
One goal of bird-habitat relationship studies is to 
identify environmental conditions that presumably 
control the distribution and abundance of a bird 
species. Knowledge of the biologically important vari- 
ables would help us make more informed manage- 
ment decisions and more accurate predictions of bird 
occurrence in new, unsurveyed sites. Because we can- 
not measure all biologically important variables, how- 
ever, the resulting models are heavily influenced by the 
choice of variables and the methods used for ex- 
ploratory analysis. Consequently, there is a critical 
need to discuss the best approaches for getting the 
most out of such observational data sets. 
In this chapter, we present analyses of data from 
the U.S. Forest Service (USFS) Northern Region Land- 
bird Monitoring Program (Hutto and Young 1999). 
Full-scale monitoring began under this program in 
1994. One aim was to conduct long-term monitoring,
so a series of permanently marked transects were located
randomly within a geographic stratification
scheme. This is not the most efficient design for dis- 
cerning habitat associations, because a preponderance 
of points lie in common cover types (Austin and Mey- 
ers 1996; Heglund, Chapter 1; Austin, Chapter 5). 
Nonetheless, we collected data on local vegetation 
characteristics in the area surrounding each point to 
obtain as much basic habitat-relationship information 
as possible to supplement the population monitoring 
data and to help explain the possible reasons for any 
population declines that may be detected later. 

The distributions of various landbird species across 
cover types in the northern Rocky Mountains have al- 
ready been published (Hutto and Young 1999). How- 
ever, the eighteen cover types used in those analyses 
pooled together a diverse assemblage of vegetation 
structures. No bird species was detected at all points 
within any given cover type. Some of those absences 
may have been due to sampling error, population fluc- 
tuations, or chance, but we assume that there were 
also absences due to finer-resolution variation in habi- 
tat characteristics among the points within a single 
cover type. We need to know what these features are if 
we are to manage bird species successfully. 

In this chapter, we discuss methods for this second
step: building regression models to expose finer-resolution
patterns of occurrence due to the continuously
variable nature of vegetation features. We apply
our proposed model-building methods to a single example
species, the Swainson’s thrush (Catharus ustulatus).
We then examine several subsets of the data to
determine the consistency and accuracy of our results 
under different conditions. Even though this is an ex- 
ploratory exercise, and even though we make use of 
uncontrolled data not designed primarily to expose 
habitat relationships, the resulting descriptive models 
should at least suggest ecological relationships and 
help focus future research. 


Methods 


Sample points were distributed across twelve national 
forests of the USFS Northern Region, in northern 
Idaho and western Montana. Transects were geo- 
graphically stratified by 7.5-minute topographic quad- 
rangle maps throughout nonwilderness Forest Service 
lands and part of the Potlatch Corporation lands in 
central Idaho (Fig. 8.1). Potential transect start points 
were located by positioning a random point within 
each quarter of the quadrangle maps and then finding 
the nearest point on an unpaved, secondary or terti- 
ary, open or closed road or trail (Hutto and Young 
1999). Transects were selected randomly from this po- 
tential set, subject to logistical constraints. The ten 
points along each transect were placed at least 250 
meters (straight-line distance) apart. Each point was 
sampled once during the breeding season in each of 
three consecutive years (1994 to 1996). There were 
some changes in transect locations between years. In 
this chapter, we used the 428 transects that were vis- 
ited in all three years, although point selection criteria 
discussed below resulted in only 292 transects being 
represented in the data set used for analyses. 


Bird Survey Methods 


Our field technique followed recommendations dis- 
cussed by Ralph et al. (1995a) and methods described 
by Hutto et al. (1986). All observers participated in a 
one-week training session. Points were visited once 
each breeding season between mid-May and mid-July. 
All birds seen or heard during a ten-minute count
period were recorded, and the distance to each was
estimated (Hutto and Young 1999). Field observers
generally began counts about 15 minutes after sunrise
(after the predawn chorus), and generally completed
counts within four hours. Counts were not conducted
on days with continuous rain or strong winds.

Figure 8.1. The distribution of permanent landbird monitoring
transects in northern Idaho and western Montana.


Vegetation Survey Methods 


Field observers first determined whether a 100-meter- 
radius circle around a survey point could be consid- 
ered a homogeneous cover type. If so, further vegeta- 
tion measurements were then taken on relatively 
continuous variables representing vegetation physiog- 
nomy and floristics (Hutto and Young 1999). 

The selection of variables to measure was based 
largely on our understanding of avian ecology from 
the literature and personal observation. Vegetation 
variables involved structural characteristics of the veg- 
etation at different layers, as well as tree species com- 
positions. Emphasis was placed on variables that were 
of potential biological importance for one or more of 
the bird species analyzed and could be collected 
quickly in the field by trained workers. 

We estimated the tree species composition of the 
canopy layer because of the evidence that floristics 
may be important in habitat relationships (e.g., Mac 
Nally 1990b). Different tree species have different ar- 
chitecture and different invertebrate assemblages (e.g., 
Recher et al. 1991). Bird species forage non-randomly 
among plant species (Airola and Barrett 1985; Roten- 
berry 1985), and nests are often placed in some tree 
species preferentially over others (e.g., Martin 1998). 
General surveys of cover types (e.g., Hutto and Young 
1999) have shown that many bird species are non-randomly
distributed across stands of different tree
species. 

Because it had been determined qualitatively that 
the vegetation cover was homogeneous out to 100 me- 
ters, we assumed that quantification of vegetation vari- 
ables within a 30-meter-radius circle (excluding the 
road corridor) was sufficient to represent the entire 
area. Therefore, all vegetation variables used in this re- 
port were estimated to 30 meters, except for counts of 
large-dbh (diameter at breast height more than 40 cen- 
timeters) trees (LGTREE), which were based on an 
11.3-meter-radius circle (excluding the road corridor). 
Ocular estimates of the following variables were con- 
ducted within the 30-meter-radius circle (Hutto and 
Hoffland 1996; Hutto and Young 1999): HEIGHT— 
the typical height of the tree canopy layer; CANOPY— 
the percent cover of canopy trees (larger than saplings); 
SAPLING—the percent cover of sapling trees (between 
5 and 10 centimeters dbh); SHRUB—the percent cover 
of tall shrubs (multistemmed woody plants greater 
than 1 meter tall); BUSH—the percent cover of low
shrubs (less than 1 meter tall); and GROUND—the 
percent cover of grasses and forbs. 

For tree species composition, we estimated the per- 
centage of the total canopy cover made up by each of 
the tree species indicated below (some associated 
species were lumped together): PIPO—percentage of 
canopy made up by ponderosa pine (Pinus ponderosa); 
PSME—percentage of canopy made up by Douglas-fir 
(Pseudotsuga menziesii); LAOC—percentage of canopy
made up by western larch (Larix occidentalis); PICO—
percentage of canopy made up by lodgepole pine (Pinus
contorta); SPRFIR—percentage of canopy made up by
spruce/fir (Picea engelmannii, Abies lasiocarpa); 
MESIC—percentage of total canopy cover made up by 
western red cedar (Thuja plicata), western hemlock 
(Tsuga heterophylla), and grand fir (Abies grandis); and 
DECID—percentage of total canopy cover made up by 
deciduous trees (Betula papyrifera, Populus tremu- 
loides, and P. trichocarpa). 

Each year, new field observers independently esti- 
mated the values of vegetation variables in conjunc- 
tion with the collection of bird data (although tree 
species composition was estimated in 1994 only). Ex- 
cept when noted, the data from the three separate 
years were averaged for each point. 


Analysis Methods 


Points with more than one cover type within 100 me- 
ters were excluded from the analyses, so the estimates 
of vegetation structure could reasonably be expected 
to represent an average of a relatively homogeneous 
area around the point. The bird data were also limited 
to 100 meters so that both the bird and vegetation 
data represented samples of the 3.14-hectare (100- 
meter-radius) area surrounding the point. Some au- 
thors recommend the use of a 50-meter radius for bird 
data, but the effect of restricting count radius is not 
uniform across species (Wolf et al. 1995). Wide- 
ranging species with loud calls, such as the common 
raven (Corvus corax) and pileated woodpecker (Dry- 
ocopus pileatus), were detected within 50 meters only 
5-15 percent of the time. Small birds with soft or 
high-pitched songs, such as the brown creeper 
(Certhia americana) and golden-crowned kinglet (Reg- 
ulus satrapa), were detected within 50 meters 85-90 
percent of the time. Some other species whose songs 
are unmistakable, such as the olive-sided flycatcher 
(Contopus cooperi) and varied thrush (Ixoreus nae- 
vius), were also identified at greater distances, perhaps 
due to observer confidence. Thus, it may be best to 
vary the cutoff radius for different species. Because 
between-species comparisons are not recommended 
for point counts (Wolf et al. 1995), the different radii 
should not be a concern. In the case of the Swainson's 
thrush, only 35 percent of detections were within 50 
meters, whereas 80 percent were within 100 meters. 
The song carries well and is easily identified, so we 
used all detections within 100 meters. To test the ef- 
fect of this decision, however, we also constructed a 
model based on a 50-meter-radius plot. 
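
The cutoff comparison above reduces to the share of detections that fall within a given radius. A minimal sketch follows; the distance estimates are invented and were chosen only so that the shares reproduce the 35 and 80 percent figures quoted for the Swainson's thrush, and the function name is hypothetical.

```python
def fraction_within(distances_m, radius_m):
    """Share of detections whose estimated distance is within the cutoff radius."""
    return sum(d <= radius_m for d in distances_m) / len(distances_m)


# Invented distance estimates (meters) for twenty detections of one species.
distances = [
    10, 20, 25, 30, 35, 40, 45,          # within 50 m
    55, 60, 65, 70, 75, 80, 85, 90, 95,  # between 50 and 100 m
    120, 150, 180, 200,                  # beyond 100 m
]

for radius in (50, 100):
    print(radius, fraction_within(distances, radius))
```

Running the same tally species by species is one way to choose a cutoff radius that retains most detections for each species without extending the plot beyond the area described by the vegetation data.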

To reduce the confounding effects of very different 
cover types, some of which we know this species 
would not occur in, we modeled the habitat associa- 
tions of the Swainson's thrush within the subset of 
conifer forest cover types. We included all points with 
conifer trees ranging from 5 to 35 meters tall and 
from 1 to 80 percent canopy coverage. By restricting 
the data set, we changed the question from the distri- 
bution of a bird species across a wide array of cover 
types to a more refined distribution within a subset of 
cover types. 

Although different points on a transect can be in
different cover types and can be argued to be independent
choices in habitat selection made by bird
species with territory sizes of a couple hectares or less, 
the relative health of the local population, local mete- 
orological conditions, and so forth, will always pro- 
duce some dependence in the data within each tran- 
sect. Nonetheless, we used individual points as sample 
units because (1) combining data from all points on a 
transect would create meaningless sample units with 
respect to vegetation variables, given that transects 
run through a series of different cover types; (2) given 
a mixture of cover types on each transect, and the 
elimination of points near edges, we included, on aver- 
age, only 3.8 points per transect; and (3) our emphasis 
on the relative importance of variables rather than 
strict rules for inclusion of variables in the model 
made the sample size a less pressing issue, although it 
was still important relative to the ability of the data to 
support a model with many parameters. We also pres- 
ent models based on one point per transect. 

For the main model of our example species (Swain- 
son's thrush), we pooled the three years of data by av- 
eraging the vegetation estimates over the three years 
and by counting the presence of the species in any of 
the three years as a presence for that point. This 
method allowed inclusion of sites where the species 
was simply missed in some of the three years. Alterna- 
tive approaches are addressed below. 
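
The pooling rule described here can be written compactly. In the sketch below, the record layout and the values are hypothetical; only the rule itself (presence in any year counts as presence, vegetation estimates are averaged) comes from the text.

```python
def pool_point(yearly_presence, yearly_vegetation):
    """Pool one point across years.

    yearly_presence: one boolean per year (species detected or not).
    yearly_vegetation: one dict of vegetation estimates per year.
    """
    present = any(yearly_presence)
    n_years = len(yearly_vegetation)
    variables = yearly_vegetation[0].keys()
    mean_veg = {
        v: sum(year[v] for year in yearly_vegetation) / n_years for v in variables
    }
    return present, mean_veg


# Hypothetical point: detected in 1995 only; three independent cover estimates.
presence = [False, True, False]
vegetation = [
    {"CANOPY": 40.0, "SHRUB": 10.0},
    {"CANOPY": 50.0, "SHRUB": 20.0},
    {"CANOPY": 60.0, "SHRUB": 30.0},
]
print(pool_point(presence, vegetation))
```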

The statistical importance of a variable in any mod- 
eling procedure depends on how close the mathemati- 
cal form of the model is to the form of the true rela- 
tionship. The simplicity of linear regression has led to 
its adoption as the typical method in wildlife habitat 
studies (Young 1996). However, a linear model as- 
sumes that a unit-unit relationship holds true for the 
entire range of an environmental attribute (Meents et 
al. 1983) so that if more is better, then a lot more must 
be much better (Johnson 1981b). By contrast, simple
niche theory assumes that organisms respond to most
important resource gradients in a unimodal fashion.
Modeling unimodal responses has been standard
procedure among plant ecologists
for decades (e.g., Whittaker 1967; Austin 1976;
ter Braak and Prentice 1988), but animal ecologists 
have been much slower to embrace it (Young 1996; 
Heglund, Chapter 1; but see Meents et al. 1983; 
Heglund et al. 1994, etc.). There is no particular reason
why the relationship must be Gaussian or even
symmetrical (Austin 1976); the mode may not even be 
the optimum (Austin 1980). We do not really know
what the shape is likely to be in any particular case, so
we used the simplest method possible to pick up at
least some of any unimodal signal while adding only
one parameter: a quadratic
term (ter Braak and Prentice 1988).
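
To see why a single quadratic term is enough to capture a unimodal signal, consider a logistic response whose linear predictor is b0 + b1*x + b2*x^2. When b2 is negative, the predicted probability peaks at x = -b1 / (2*b2). The sketch below illustrates this with hypothetical coefficients for a percent-cover variable; it shows the shape of such a curve and is not a fitted model from this study.

```python
import math


def occurrence_probability(x, b0, b1, b2):
    """Logistic response with a linear and a quadratic term."""
    logit = b0 + b1 * x + b2 * x * x
    return 1.0 / (1.0 + math.exp(-logit))


def mode_of_response(b1, b2):
    """Location of the peak; defined only for a negative quadratic coefficient."""
    if b2 >= 0:
        raise ValueError("the response is unimodal only when b2 < 0")
    return -b1 / (2.0 * b2)


# Hypothetical coefficients for a percent-cover variable (0-100).
b0, b1, b2 = -4.0, 0.2, -0.002
peak = mode_of_response(b1, b2)
for cover in (10, 50, 90):
    print(cover, round(occurrence_probability(cover, b0, b1, b2), 3))
print("peak near", round(peak, 1))
```

A purely linear term could not represent this pattern: with low probabilities at both 10 and 90 percent cover and a high probability in between, the fitted slope would be close to zero and the variable would appear unimportant.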

A significant quadratic term can result from nonlin- 
ear but monotonic relationships, such as asymptotes, 
as well as from unimodal relationships. Such response 
curves would be expected if there were a threshold in 
the response or if we did not have the complete gradi- 
ent for the variable. In such cases, however, the linear 
regression would also show a strong relationship, so 
the quadratic term may not be necessary to determine 
the importance of a variable (although it may help im- 
prove predictions). The inclusion of the quadratic 
term is even more important in cases where linear re- 
gression indicates no relationship. 

If a bird species is associated with one tree species, 
it is likely to show a quantitative response to others 
because the proportions are interdependent. One way 
around this might be an ordination procedure, although
this may result in gradients due to overall productivity
and/or forest structure rather than direct effects
of tree species composition. We opted to use the
direct variables to more easily interpret direct effects 
of tree species. We did not consider quadratic terms 
for the tree species variables because sparse data 
(many zeros) would make the additional parameter 
less supportable, and because such relationships 
would have little logical interpretation. 

We visited point counts only once per year. Because 
single visits do not commonly produce multiple detec- 
tions of any one bird species, we used logistic regres- 
sion (Hosmer and Lemeshow 1989) to analyze the ef- 
fects of vegetation variables on the presence or 
absence of each bird species at the points. In addition, 
biases due to detectability and observer variability 
should be less pronounced in presence/absence data 
than in abundance data. 

To begin the model-building process, we discarded 
variables that had exceptionally high p-values in simple 
regressions and variables that made no biological sense 
for the particular species (in this case the Swainson's
thrush). We then used the Akaike Information Criterion
(AIC) for model selection (Akaike 1974; McQuarrie
and Tsai 1998). AIC incorporates the tradeoff between
bias and variance as variables are added to a
model, and it provides a straightforward comparison 
between models that does not depend on a hypothesis- 
testing framework (Burnham and Anderson 1998). It 
moves the emphasis away from p-values and arbitrary 
cutoff criteria (Johnson 1999a), and extracts more information
from the data regarding the relative strength
of evidence for each variable (and model). 
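
For reference, AIC is computed as -2 ln(L) + 2k, where L is the maximized likelihood and k is the number of estimated parameters. The sketch below applies that formula to three hypothetical models and converts the AIC differences to Akaike weights, one common way of expressing relative strength of evidence. The log-likelihoods are invented, and the use of weights here is an illustration rather than a description of the calculations in this chapter.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: -2 ln(L) + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def akaike_weights(aic_values):
    """Relative support for each model, scaled to sum to 1."""
    best = min(aic_values)
    relative = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical (log-likelihood, parameter count) pairs for three nested models.
models = {"A": (-520.0, 3), "B": (-515.5, 4), "C": (-515.2, 5)}
scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
weights = dict(zip(scores, akaike_weights(list(scores.values()))))

for name in models:
    print(name, round(scores[name], 1), round(weights[name], 3))
```

Model C fits slightly better than model B but pays for its extra parameter, so B has the lowest AIC; the weights show that C nonetheless retains appreciable support, which is the kind of graded evidence that a fixed p-value cutoff discards.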

When choosing among models using statistical in- 
ference, it is best to work with only a few select mod- 
els chosen a priori on biological grounds (Burnham 
and Anderson 1998). However, our data set was both 
correlative and unstructured, and we knew little about 
the expected relationships for many species, so we 
considered this an exploratory analysis. The best 
method of variable selection for modeling such a data 
set has been the subject of much discussion. Stepwise 
selection methods do not necessarily identify the
“best” model even from a statistical perspective
(James and McCulloch 1990). On the other hand, all- 
possible-subsets procedures will inevitably lead to 
overfitting of the data, because the model thus chosen 
will be highly specific to the data at hand (Burnham 
and Anderson 1998). In fact, no method can produce 
the “true” biological model from correlative data, and 
some overfitting is perhaps inevitable. We chose the 
most influential variables by forward selection, with 
AIC as the selection criterion for each step. As inclu- 
sion of variables became more uncertain, we modified 
the procedures to more closely resemble an all- 
possible-subsets methodology. This allowed the com- 
parison of many likely models and embraced the idea 
of alternative models and model-selection uncertainty. 
Although we report the model thus chosen, the goal 
was not to produce a single final model but rather to 
determine the strength of evidence for the inclusion of 
each variable (Burnham and Anderson 1998: 202). 
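
The forward-selection step with AIC as the criterion can be sketched generically. In the example below, `score` stands for any function that fits a model with the given variables and returns its AIC; here it is replaced by a lookup table of invented AIC values for three placeholder variables, so the sketch shows only the selection logic and not the later shift toward an all-possible-subsets comparison.

```python
def forward_select(candidates, score):
    """Greedy forward selection: add the variable that lowers AIC the most,
    and stop when no remaining variable lowers it further."""
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        trial_score, trial_var = min((score(selected + [v]), v) for v in remaining)
        if trial_score >= best:
            break
        selected.append(trial_var)
        remaining.remove(trial_var)
        best = trial_score
    return selected, best


# Invented AIC values for every model the search will visit (placeholders only).
toy_aic = {
    frozenset(): 1100.0,
    frozenset({"x1"}): 1060.0,
    frozenset({"x2"}): 1075.0,
    frozenset({"x3"}): 1090.0,
    frozenset({"x1", "x2"}): 1045.0,
    frozenset({"x1", "x3"}): 1055.0,
    frozenset({"x1", "x2", "x3"}): 1046.5,
}


def toy_score(variables):
    """Stand-in for fitting a model and returning its AIC."""
    return toy_aic[frozenset(variables)]


chosen, final_aic = forward_select(["x1", "x2", "x3"], toy_score)
print(chosen, final_aic)
```

In this toy case the third variable raises AIC by 1.5 units, so it is left out, but a difference that small is exactly the situation in which comparing several near-equivalent models is more informative than reporting a single winner.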


Accuracy Issues 


We did not have independent data for testing the ac- 
curacy of our models, but we have analyzed the data 
in several different ways to get an indication of the robustness
of the models in terms of the variables included.
We performed a cross-validation procedure by
splitting the full database in half. We sorted transects 
by latitude and longitude and selected every other 
transect for each subset of the data. We then built a lo- 
gistic regression model for each subset and compared 
the classification accuracy of each model when pre- 
dicting the observed data for the other (test) subset 
relative to the training set. For a classification accu- 
racy assessment that was independent of the cut-point 
threshold, we used receiver operating characteristic 
(ROC) plots (Swets 1988; Fielding and Bell 1997; 
Pearce et al., Chapter 32). 
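
The two ingredients of this procedure, an alternating split of spatially sorted transects and a threshold-independent accuracy measure, can be sketched as follows. The sort key (latitude, then longitude) is an assumption about how the sorting was done, the record layout is hypothetical, and the area under the ROC curve is computed by direct pairwise comparison of presences and absences rather than by plotting the curve.

```python
def alternate_split(transects):
    """Sort transects by (latitude, longitude) and assign every other one to each half."""
    ordered = sorted(transects, key=lambda t: (t["lat"], t["lon"]))
    return ordered[0::2], ordered[1::2]


def roc_auc(scores, labels):
    """Area under the ROC curve: the share of presence/absence pairs in which
    the presence site received the higher predicted score (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical transects and predictions.
transects = [
    {"id": "t1", "lat": 46.1, "lon": -114.2},
    {"id": "t2", "lat": 47.3, "lon": -115.0},
    {"id": "t3", "lat": 45.9, "lon": -113.8},
    {"id": "t4", "lat": 48.0, "lon": -116.1},
    {"id": "t5", "lat": 46.7, "lon": -114.9},
]
half_a, half_b = alternate_split(transects)
print([t["id"] for t in half_a], [t["id"] for t in half_b])
print(roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))
```

Splitting on sorted position interleaves the two halves geographically, so each half covers the whole region; an AUC of 0.5 corresponds to a model that ranks sites no better than chance, and 1.0 to perfect ranking.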

We also checked the consistency of results by build- 
ing a model for each of the three years separately. Ex- 
amining each year separately is not an independent val- 
idation, of course, because we sampled the same points 
and, in some cases, the same individual birds returning 
the next year (or their philopatric offspring). However, 
the results from three consecutive years should be con- 
sistent if we expect the models to perform well on inde- 
pendent data. We used the same methods as above to 
build multiple logistic regression models for 1994, 
1995, and 1996 separately, and compared the classifica- 
tion accuracy to that of the original three-year model 
using ROC plots. We used the same data set for the 
vegetation variables, with averages across all three 
years, so that any year-to-year variability was due to the 
bird data only, whether from sampling error or from 
actual changes in occupation of sites. 


Sample Unit Considerations 


We tested the sensitivity of our results to pseudorepli- 
cation issues by redoing the analyses with one ran- 
domly selected point from each transect. We had the 
luxury of doing this because of the large sample size in 
our regional program. We selected two subsamples, 
each with one randomly selected point from each tran- 
sect. The second subsample was selected without mak- 
ing the points from the first set available for selection 
(i.e., sampling without replacement). The fifty-eight 
transects with only one available point were randomly 
divided between the two subsamples. This produced 
two subsamples of 263 points, each with only one 
point per transect and with no points in common. 
Multiple logistic regression models were built for 
these data sets using the same methods as above, and
the classification accuracy was compared to the original
model using ROC plots.
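
The subsampling scheme can be sketched as below. The data layout (a dictionary mapping each transect to its list of usable points) and the function name are hypothetical, and a fixed random seed is used only to make the illustration repeatable.

```python
import random


def two_disjoint_subsamples(points_by_transect, seed=0):
    """Draw two subsamples, each with at most one point per transect and no
    points in common. Transects with a single usable point are divided at
    random between the two subsamples."""
    rng = random.Random(seed)
    first, second, singles = [], [], []
    for points in points_by_transect.values():
        if len(points) == 1:
            singles.append(points[0])
        else:
            # Two different points from the same transect, drawn without replacement.
            a, b = rng.sample(points, 2)
            first.append(a)
            second.append(b)
    rng.shuffle(singles)
    half = len(singles) // 2
    first.extend(singles[:half])
    second.extend(singles[half:])
    return first, second


# Small hypothetical example: three multi-point transects and two single-point ones.
example = {
    "T1": ["T1-a", "T1-b", "T1-c"],
    "T2": ["T2-a", "T2-b"],
    "T3": ["T3-a", "T3-b", "T3-c", "T3-d"],
    "T4": ["T4-a"],
    "T5": ["T5-a"],
}
sub1, sub2 = two_disjoint_subsamples(example, seed=1)
print(len(sub1), len(sub2), set(sub1).isdisjoint(sub2))
```

Applied to 292 transects of which 58 have a single usable point, this scheme yields 234 + 29 = 263 points in each subsample, which agrees with the counts reported above.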


Poisson Regression 


When we pooled the bird data from three years at 
each point, we obtained considerable variation in 
abundance for some bird species, which was lost when 
we converted the data to presence or absence for lo- 
gistic regression (Mac Nally 1990a): A point where a 
Swainson’s thrush was detected in only one year (or 
was mistakenly identified) was given the same impor- 
tance value as a point with many territorial thrushes 
singing every year. To better differentiate the relative 
use of the sites by this species, we reanalyzed the 
three-year data set using the summed abundances and 
Poisson regression (Jones et al., Chapter 35). Count 
data are more likely to follow a Poisson distribution 
than any other readily available distribution (but see 
White and Bennetts [1996] for a recommendation of 
the negative binomial distribution), and the method is 
fairly robust, requiring only that the variance in the 
data be proportional to the mean (McCullagh and 
Nelder 1989). 
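
A quick, informal check on the mean-variance assumption is the ratio of the sample variance to the sample mean of the counts. The sketch below uses invented summed three-year counts; a raw ratio of this kind ignores the covariates in the regression, so it is only a rough screen and not a substitute for examining the fitted model.

```python
def dispersion_ratio(counts):
    """Sample variance divided by sample mean; values near 1 are what a strict
    Poisson distribution implies, larger values suggest overdispersion."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


# Invented summed counts of one species at ten points.
counts = [0, 0, 1, 2, 0, 3, 1, 0, 5, 2]
print(round(dispersion_ratio(counts), 2))
```

A ratio well above 1, as in this made-up example, is the circumstance in which an alternative such as the negative binomial distribution might be considered.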


Regional-scale Considerations 


In any study of habitat use, the set of “available” loca- 
tions must be carefully chosen (Johnson 1980). If the 
species is not present in some areas for any reason other 
than the variables we have measured in the study, then 
it would be misleading to dilute the data with such ab- 
sences, or “naughty noughts” (Austin and Meyers 
1996), in potentially suitable habitat that is not occu- 
pied for other reasons (e.g., climate or landscape-scale 
factors, or current or historical dispersal barriers). If 
some measured vegetation variables also change across 
the same gradient, then it may look like those measured 
variables are controlling the distribution rather than the 
unmeasured factors that are truly limiting. 

More than one scale is involved here. If data cover 
a large region, the geographic range of a species may 
not extend throughout the entire area. But even within 
a species’ range, there may be suitable habitat that is 
not occupied due to landscape-scale factors. In a study 
of local-scale factors, both of these problems might be 
addressed by using only those transects along which a 
particular bird species was detected. Those occupied 


transects are ones in which the range, landscape, and 
season are apparently appropriate for the bird species' 
presence. If landscape-scale factors are to be included 
in the models, then we would want to analyze all tran- 
sects within the occupied range. 
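
The restriction to occupied transects amounts to a simple filter on the point records. The sketch below assumes a hypothetical record layout of (transect, point, detected); the identifiers and the function name are illustrative only.

```python
def occupied_transects(records):
    """Keep every point on transects where the species was detected at least once.

    records: list of (transect_id, point_id, detected) tuples (assumed layout).
    """
    occupied = {transect for transect, _, detected in records if detected}
    return [rec for rec in records if rec[0] in occupied]


# Hypothetical records: transect T2 has no detections and is dropped entirely.
records = [
    ("T1", "p1", True),
    ("T1", "p2", False),
    ("T2", "p1", False),
    ("T2", "p2", False),
    ("T3", "p1", True),
]
kept = occupied_transects(records)
```

Absences on occupied transects are retained, so the filtered data still contain both outcomes for a local-scale model.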

Choosing the best approach to this problem may be 
a subjective exercise. For example, we detected the 
Swainson's thrush only rarely in south-central Mon- 
tana (Fig. 8.2). However, because this area was well 
within the geographic range of the species (Montana 
Bird Distribution Committee 1996), we felt that it still 
could have been present in appropriate habitat. It
therefore would be reasonable to control for land-
scape-scale effects on this species by using only the
occupied transects. However, because this abundant
species was found on about 85 percent of the tran-
sects, that alternative was not likely to produce differ-
ent results; it may be more useful for less-common
species. We decided instead to restrict the data
based on geographic area. Because we were interested
in the effect of the geographic distribution of western 
larch on the importance of that variable in the habitat 
models, we restricted the data to the geographic range 
of the larch (all forests west of the Continental Divide 
except for the Bitterroot National Forest), which was 
also the area where Swainson's thrushes were most
common. We built a new logistic regression model
using this subset of the data.

Figure 8.2. The geographic distribution of the Swainson's
thrush (Catharus ustulatus) across all 1,102 points used in
the analyses. Closed circles indicate a presence of the
Swainson’s thrush in any of the three years and open circles
indicate absences in all years. Points within transects are
nearly congruent in this depiction.

Another regional-scale consideration is a potential 
change in habitat use in different parts of a species’ 
range. We did not pursue this question because, with 
this kind of correlative data, it would be very difficult 
to show that the inevitable differences between the 
models for two areas were due to actual biological dif- 
ferences in habitat selection rather than different com- 
petitive environments or simply sampling error. 


Results 


The final data set included a total of 1,102 points on 
transects visited in each of three years. We considered 
the area around a point to be relatively homogeneous 
(no edges) if only one (559) or none (543) of the three 
observers thought otherwise. 

Almost any data set involving multiple vegetation 
variables will include a number of intercorrelations 
among the predictor variables. The highest correla- 
tions among the predictor variables in our data were 
among canopy cover, canopy height, and number of 
large trees, especially the latter two measures of tree 
size (Table 8.1). The proportion of mesic conifer 
species (western red cedar, western hemlock, and 
grand fir) in the canopy was also highly correlated 
with those three variables, especially canopy cover. 
Sites with more ponderosa pine had the lowest aver- 
age canopy cover. The greatest understory develop- 
ment was under canopies of larch or, secondarily, 
spruce/fir. The proportions of ponderosa pine and 
Douglas-fir were negatively related to understory. Be- 
cause we had already combined the most important 
species associations (spruce/fir and cedar/hemlock/ 
grand fir), most of the correlations among conifer 
species variables were negative. The largest correlation 
coefficient (r) was less than 0.5 (Table 8.1), so with 
our sample size it should be possible to tease apart the 
effects of all variables, at least to some extent. 
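
For readers unfamiliar with the statistic reported in Table 8.1, the following is a minimal pure-Python sketch of Kendall's tau-b with its correction for ties. It is not the software used in this study, and the example values are hypothetical.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences, corrected for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: enters neither term
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = sqrt((concordant + discordant + tied_x_only)
                 * (concordant + discordant + tied_y_only))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical canopy-height classes and large-tree counts at four points.
height = [1, 2, 2, 3]
large_trees = [1, 2, 3, 3]
tau = kendall_tau_b(height, large_trees)
```

The pairwise loop is quadratic in the number of points, which is adequate for a sketch but slow for very large data sets.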

Swainson's thrushes were detected in at least one 
year at 555 of the 1,102 points. Therefore, the cate- 
gories of presence and absence were nearly equal for 
the main analysis of the three-year data set using lo- 
gistic regression. The most important variables in this 
main model (logistic regression of three-year averages; 


Table 8.2) appeared to be understory cover consisting 
of tall shrubs and conifer saplings, positive associa- 
tions with larch and mesic tree species and, to a lesser 
degree, canopy cover. 

The classification accuracy (at a cut-point of 0.5) of 
the main model was about 72 percent. When only the 
strongest variables were used (CANOPY, SAPLING, 
SHRUB, SHRUB2, LAOC, and MESIC), then the clas- 
sification accuracy was still 71 percent. 
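
Classification accuracy at a cut-point is the proportion of points whose predicted class matches the observation. A minimal sketch follows, with hypothetical fitted probabilities and a hypothetical function name.

```python
def classification_accuracy(probs, observed, cut_point=0.5):
    """Share of points classified correctly when predicting presence at p >= cut_point."""
    predicted = [1 if p >= cut_point else 0 for p in probs]
    correct = sum(1 for pred, obs in zip(predicted, observed) if pred == obs)
    return correct / len(observed)


# Hypothetical fitted probabilities and observed presence (1) or absence (0).
probs = [0.9, 0.6, 0.4, 0.2, 0.7]
observed = [1, 0, 1, 0, 1]
accuracy = classification_accuracy(probs, observed)
```

Because the result depends on the chosen cut-point, a threshold-free summary such as the ROC plot is a useful complement.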

When the data set was split in half for cross-valida- 
tion, each half had an internal classification accuracy 
of about 73 percent. When each of the resulting mod- 
els was used to predict presence in the other half of 
the data, the ROC plots (Fig. 8.3) indicated a classifi- 
cation accuracy for this test set that was nearly as high 
as for the training set. In fact, the internal classifica-
tion accuracy of the models for each half was similar
to that for the main model (Fig. 8.4a).
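
An ROC plot traces the true-positive rate against the false-positive rate as the cut-point is varied, and the area under the curve summarizes it in one number. The sketch below is a generic illustration with invented probabilities, not the procedure used to draw Figures 8.3 and 8.4; the function names are hypothetical. In a split-half test, the same functions would be applied to predictions for the half of the data that was not used to fit the model.

```python
def roc_points(probs, observed):
    """(false-positive rate, true-positive rate) at each distinct cut-point."""
    n_pos = sum(observed)
    n_neg = len(observed) - n_pos
    points = [(0.0, 0.0)]
    for cut in sorted(set(probs), reverse=True):
        tp = sum(1 for p, o in zip(probs, observed) if p >= cut and o == 1)
        fp = sum(1 for p, o in zip(probs, observed) if p >= cut and o == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(probs, observed):
    """Area under the ROC curve as the share of presence/absence pairs ranked correctly."""
    pos = [p for p, o in zip(probs, observed) if o == 1]
    neg = [p for p, o in zip(probs, observed) if o == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for a held-out half of the data.
test_probs = [0.9, 0.6, 0.4, 0.2, 0.7]
test_observed = [1, 0, 1, 0, 1]
curve = roc_points(test_probs, test_observed)
area = auc(test_probs, test_observed)
```

An area of 0.5 corresponds to a model that ranks points no better than chance, and 1.0 to perfect ranking.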

When each year was analyzed separately, Swain- 
son's thrushes were detected on 353 of the 1,102 
points in 1994, 294 in 1995, and 281 in 1996. There 
were some differences in the apparent importance of 
the vegetation variables in models for the three differ- 
ent years (Table 8.2), most notably for canopy cover 
and tree species composition. The internal classifica- 
tion accuracy for these models was not quite as good 
as that for the main model (Fig. 8.4b). 

To determine the sensitivity of our results to the use 
of points as sample units, we also randomly selected 
two subsets of the data that consisted of single points 
from each of 263 transects. Swainson's thrushes were 
detected on 128 and 134 of the 263 points in these 
two separate subsets. The models based on these two 
data sets were quite different from one another (Table 
8.2), with more variables being included in the second 
model (including tree size). This second model was 
also the only model that did not include western larch 
as a tree species associate (although it would have if 
only positive tree associations were allowed). Al- 
though the internal classification accuracy for these 
models was better than that for the main model (Fig. 
8.4c), validation of each model using the other subset 
as testing data gave relatively poor accuracy (Fig. 8.5). 
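
Drawing one point at random from each transect can be sketched as follows. The record layout, the identifiers, and the use of a fixed seed are assumptions made for illustration; the chapter does not describe how the random selection was implemented.

```python
import random
from collections import defaultdict


def one_point_per_transect(points, seed):
    """Return {transect_id: point_id} with one randomly chosen point per transect.

    points: list of (transect_id, point_id) pairs (assumed layout).
    """
    by_transect = defaultdict(list)
    for transect_id, point_id in points:
        by_transect[transect_id].append(point_id)
    rng = random.Random(seed)  # seeded so that a subset can be reproduced
    return {t: rng.choice(sorted(ids)) for t, ids in sorted(by_transect.items())}


# Hypothetical transects with differing numbers of points.
points = [("T1", "a"), ("T1", "b"), ("T2", "c"),
          ("T3", "d"), ("T3", "e"), ("T3", "f")]
subset_a = one_point_per_transect(points, seed=1)
subset_b = one_point_per_transect(points, seed=2)
```

Two different seeds give two subsets of equal size, analogous to the Tr A and Tr B data sets, although the subsets are not guaranteed to differ at every transect.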

The restriction of data to detections within a 50-
meter radius did not change the core variables of the
model (Table 8.2). The data supported fewer minor


TABLE 8.1.


Nonparametric correlation coefficients (Kendall's tau-b) for bivariate comparisons of predictor variables.


         SAPLING  SHRUB    BUSH     GROUND   HEIGHT   LGTREE   PIPO     PSME     LAOC     PICO     SPRFIR   MESIC
CANOPY   +0.09    -0.04    -0.03    -0.12    [?]      +0.31    -0.12    +0.02    -0.01    -0.04    -0.06    +0.24
SAPLING           +0.06    +0.02    -0.03    [?]      -0.13    [?]      [?]      +0.15    [?]      +0.10    [?]
SHRUB                      +0.47    +0.03    <0.01    <0.01    -0.02    +0.03    +0.25    -0.16    [?]      [?]
BUSH                                +0.10    +0.01    [?]      [?]      [?]      +0.21    -0.07    +0.09    +0.04
GROUND                                       -0.03    -0.03    <0.01    +0.03    +0.08    <0.01    <0.01    -0.04
HEIGHT                                                +0.37    +0.04    +0.06    -0.04    -0.15    -0.03    +0.23
LGTREE                                                         +0.10    +0.09    -0.04    -0.24    +0.02    +0.16
PIPO                                                                    +0.09    -0.13    -0.24    -0.22    -0.15
PSME                                                                             0.05     0.24     0.31     0.23
LAOC                                                                                      +0.03    +0.07    -0.03
PICO                                                                                               +0.07    -0.27
SPRFIR                                                                                                      -0.12

TABLE 8.2. 


Order of selection for variables included in multiple regression models of the habitat relationships of the Swainson’s thrush 
(Catharus ustulatus), using AIC (Akaike information criterion). 


                                   Models^a
            Poisson   Logit     Logit     1994      1995      1996      Tr A      Tr B      West
            N =       100 m     50 m      N =       N =       N =       N =       N =       N =
Variable    1102                          1102      1102      1102      263       263       749
CANOPY + 7C 4c 6c 3c 3c 4c 4c 
bF 8c 9c 5c 4c 
SAPLING + 2c 2c 2c 2c 2c 3c 2€ 1c als 
b+ 5e 2c 
SHRUB + 1c 1c 1° 1c ale alt ac Dc 3e 
b+ 6c 3c 4c 6c 4c Te 
BUSH + 7e 7 14 7 5° 
b+ 8c 8 12 8 
GROUND 
HEIGHT + 9 9 8c 3c 
bF alae 10 9s 3e 
LGTREE - 12c a 8 6c 
PSME E 
LAOC t 3c 3c 5c 5c 4c 2c 4c 2c 
PIPO - 10€ 10 6 7C 
PICO - 6c 
SPRFIR + iig Te 
MESIC + 4c 4c 6c 3c ee 
DECID 


^a All models used logistic regression except for the Poisson model. All models used the same vegetation data, with the variables averaged over three years. The first three models used accumulated data for abundance or presence of Swainson’s thrush (Catharus ustulatus) over all three years. The models designated by dates used presence data from each of the three years separately. The Tr A and B models are based on two separate sets with one randomly chosen point per transect, and the last column is based on the subset of transects from west of the Continental Divide.

^b Indicates quadratic term for indicated variables; signs are for linear and quadratic term, respectively.

^c Indicates variables included under traditional hypothesis-testing methods.
Figure 8.3. Receiver operating characteristic (ROC) plots
showing the classification accuracy of the models based on 
each half of the data set in predicting the data for the same 
half (training data) and the other half (test data). 


variables, with a shift to understory variables rather 
than tree size. The internal classification accuracy for 
this model was not quite as good as that for the main 
(100-meter) model (Fig. 8.4d). 

The abundance of Swainson’s thrushes at the 555 
points varied between one and eleven (sum of three 
visits; some high numbers brought the accuracy of the 
abundance data into question). More than one indi- 
vidual was detected at 348 points. Therefore, there 
was considerable variation in counts for use in a Pois- 
son regression analysis. The resulting model was simi- 
lar to that obtained by logistic regression (Table 8.2), 


although it was less likely to indicate nonlinear rela- 
tionships and it did not include tree-size variables. 

We detected the Swainson’s thrush on 493 of 749 
points in the northwestern part of the region. The re- 
sulting logistic regression model was similar to that 
obtained for the full data set of 1,102 points (Table 
8.2), although without canopy height, SPRFIR, and 
the quadratic term for canopy cover. 

The AIC method usually indicated additional vari- 
ables beyond those included by traditional hypothesis 
testing with alpha equal to 0.05. Because of the many 
models we examined in this exploratory analysis, it is 
likely that there was some overfitting of the data when 
the best model was chosen according to AIC. 
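
Why AIC tends to retain more variables than a test at alpha = 0.05 can be seen from the arithmetic for a single added parameter. AIC = 2k - 2 ln(L) falls whenever the log-likelihood rises by more than 1, whereas a likelihood-ratio test with one degree of freedom requires a rise of about 1.92 (half the chi-square critical value of 3.841). The log-likelihoods below are hypothetical, and the likelihood-ratio test is used here only as one common form of hypothesis testing; the chapter does not state which test was applied.

```python
CHI2_CRIT_1DF_05 = 3.841  # chi-square critical value, 1 df, alpha = 0.05


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate the preferred model."""
    return 2 * n_params - 2 * log_likelihood


def lrt_keeps_extra_term(ll_small, ll_big):
    """True if one added parameter is significant in a likelihood-ratio test."""
    return 2 * (ll_big - ll_small) > CHI2_CRIT_1DF_05


# Hypothetical nested models: the larger adds one variable and gains 1.5 log-likelihood units.
ll_small, k_small = -700.0, 6
ll_big, k_big = -698.5, 7

aic_prefers_big = aic(ll_big, k_big) < aic(ll_small, k_small)
lrt_prefers_big = lrt_keeps_extra_term(ll_small, ll_big)
```

In this example the added variable is kept by AIC but rejected by the test, which is the pattern described above.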


Discussion 


Two main goals of building regression models in habi- 
tat relationships are to identify biologically important 
variables and to predict the occurrence of bird species 
at previously unsampled sites. The first goal, identify- 
ing environmental conditions that a bird species needs 
to be present and successful, is of obvious scientific in- 
terest. In addition, it is only through understanding 
the true biological processes involved in a species’ dis- 
tribution that we can hope to reach meaningful man- 
agement recommendations. Determining the impor- 
tant variables can be difficult for a number of reasons. 
We know we have not measured many potentially im- 
portant biological variables, such as food resource 
availability (Hutto 1990) or specific nest sites (Martin 
1992). In addition, the biological importance of the 
measured variables cannot be directly confirmed from 
a correlative analysis. It is necessary to assume that 
the importance of the larger biological effects will be 
revealed by the statistical model, especially if several 
subsets of the data are examined, but there is no way 
to know how much of the observed effect is due to ac- 
tual biological processes or to sampling error. This in- 
herent model uncertainty should encourage us to em- 
phasize the strength of evidence for each variable 
rather than try to decide which quantitative model is 
“correct.” The observed evidence of relative impor- 
tance must then be used to form hypotheses for fur- 
ther investigation. 

In this study, the various regression models of 
Figure 8.4. Receiver operating characteristic (ROC) plots comparing the internal classification accuracy
(resubstitution) of different models to the main model (see text). (a) Accuracy of models for each half of the
data compared with main model; (b) accuracy of models for each year of data compared with main (three- 
year) model; (c) accuracy of models for each of two subsets with one point per transect compared with main 
model; and (d) accuracy of model based on 50-meter radius compared with main (100-meter) model. 


Swainson’s thrush habitat relationships were fairly 
consistent in that they all included the same set of 
strongly influential variables (Table 8.2). The variables 
that appeared to be the weakest predictors of Swain- 
son’s thrush occurrence in any one model were the 
same variables that were less consistently included in 
the other models. All of the variables chosen by AIC 
but not by hypothesis testing methods were in this cat- 
egory. In fact, a model with only the strongest and 
most consistent variables had nearly the same predic- 
tive ability as the full model. This suggests that the 


weaker variables were either biologically unimportant 
or inconsistently correlated with the true controlling 
variables. The increased resolution necessary to under- 
stand the possible effects of these variables would re- 
quire a much larger sample size or a more intensive, 
controlled study. 

This is a first attempt at getting a list of variables, 
more or less in order of statistical (but not necessarily 
biological) importance, within this data set. We can 
never be sure if the model reflects true biological rela- 
tionships without confirming the results with inde- 
Figure 8.5. Receiver operating characteristic (ROC) plots
showing the classification accuracy of the models based on 
each of two subsets of data (with one point per transect) in 
predicting the data for the same subset (training data) and the 
other subset (test data). 


pendent or experimental data, but managers can still 
benefit from such a model because it helps focus fu- 
ture studies and provides a first approximation of the 
important controlling variables, which can aid in 
management decisions. 

A first step for managers would be to look closely 
at the variables that were most consistently included 
in the models. Clearly, the understory is critical for the 
Swainson’s thrush. Both tall shrub and conifer sapling 
cover were included in every model, usually with the 
first or second strongest associations. Because shrub 


cover and sapling cover were only weakly correlated 
(r = +0.06; Table 8.1), it seems that conifer saplings 
may provide an adequate substitute for this bird 
species as understory structure. Also, there appeared 
to be a threshold (asymptote) in the relationship of 
bird occurrence and understory cover (Fig. 8.6), indi- 
cating that 20-30 percent understory cover provided 
maximum benefit, as might be expected for a shrub- 
nesting species that forages more generally (Ehrlich et 
al. 1988). Because management practices tend to in- 
crease the amount of land with this level of understory 
cover, this species is not likely to be of management 
concern. 
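
A threshold of this kind can be checked crudely without a smoother by tabulating the proportion of occupied points within bins of understory cover. The following sketch is a simple stand-in for the LOWESS curve of Figure 8.6, not a reproduction of it; the bin width, the function name, and the data are assumptions made for illustration.

```python
def occupancy_by_bin(cover, presence, bin_width=10):
    """Proportion of occupied points per cover bin, keyed by the bin's lower edge."""
    totals, hits = {}, {}
    for c, p in zip(cover, presence):
        lower = int(c // bin_width) * bin_width
        totals[lower] = totals.get(lower, 0) + 1
        hits[lower] = hits.get(lower, 0) + p
    return {b: hits[b] / totals[b] for b in sorted(totals)}


# Invented percent understory cover and presence (1) or absence (0) at eight points.
cover = [2, 5, 12, 18, 25, 28, 35, 44]
presence = [0, 0, 0, 1, 1, 1, 1, 1]
rates = occupancy_by_bin(cover, presence)
```

An asymptote would appear as occupancy that rises across the lower bins and then stays roughly constant in the higher ones.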

We do not know of any particular biological reason 
for larch to be important to the Swainson’s thrush, 
and this demonstrates the ambiguity of exploratory 
analyses. Larch cover was correlated with shrub cover 
(r = 0.24), but both variables were strongly significant 
in most multivariate models, so it is difficult to know 
whether this was a true biological relationship or an 
artifact of confounding variables. The fact that larch is 
restricted to west of the Continental Divide was our 
main reason for limiting the bird data to this western 
region for one model. In this way, we discovered that 
larch was still an important variable for the thrush 
within the tree’s geographic range, so the apparent as- 
sociation between the bird and tree species was not an 
artifact of geography. Further study would be neces- 
sary to determine if the retention of larch in the land- 
scape is as important for this species as it is for many 
cavity-nesting birds (McClelland 1977), but this may 
be a good example of a relationship that was not ap- 
parent using simple cover type distributions (Hutto 
and Young 1999). 

The negative association of Swainson’s thrush oc- 
currence with ponderosa pine could be due simply to a 
positive association with other tree species, or perhaps 
the thrush does not do well in that type of tree archi- 
tecture. Alternatively, it may have more to do with 
ponderosa pine stands typically having low canopy 
cover or minimal understory. Although the multivari- 
ate analyses may have been able to tease these apart, 
some residual effect probably remained. 

We did not have an independent data set with 
which to test our models. However, we used a number 
of internal validation and classification accuracy 
Figure 8.6. Occurrence of the Swainson's thrush along an
environmental gradient representing percent coverage of tall
understory vegetation (sum of tall shrub and conifer sapling
variables). Absence = 0; Presence = 1; curve generated by
LOWESS smoothing.


procedures to explore the robustness of our results. 
We assumed that the most consistently included vari- 
ables were more likely to have some biological foun- 
dation. This is not only of scientific interest, but 
should also increase the usefulness of the results for 
predicting the presence of the Swainson's thrush at 
new sites, and for estimating its probable response to 
management decisions. 

The cross-validation procedure seemed to show 
that we have created relatively robust and useful mod- 
els. Models based on each of the two halves of the 
data set not only had classification accuracies nearly 
as large as the full model (Fig. 8.4a), but also the con-
sistency of the models in predicting the other half of
the data was encouraging. These results also suggest 
that doubling the sample size did not result in a 
greatly improved model. 

The combination of three years of data appeared to 
improve the predictive power of the habitat model rel- 
ative to the models based on single years. This is un- 
derstandable given the sampling methods. A point 
count survey provides an incomplete sample of the 
birds in a given area. It is probably common for a 
species to be present but not detected. There is also 
true year-to-year variation in bird occupancy. It is im- 
portant to design surveys to sample a representative 


cross-section of this variation (i.e., multiple years, 
places, etc.). In this respect, the Northern Region 
Landbird Monitoring Program may be unique. Sam- 
ples were large in comparison with other controlled 
studies of habitat use, and data collection was re- 
peated over several years. The results of these analyses 
suggest that improving the accuracy of data at each 
point may be more critical than increasing the total 
number of points. We are beginning to test these ideas 
by examining the accuracy of models for other bird 
species. 
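
The benefit of pooling years can be illustrated under a simplifying assumption that is not tested in this chapter: that a species present at a point is detected independently in each year with the same probability p. The chance of at least one detection in k years is then 1 - (1 - p)^k. The function name and the values of p below are hypothetical.

```python
def prob_detected_at_least_once(p_single, n_visits):
    """Chance of one or more detections in n_visits independent visits."""
    return 1 - (1 - p_single) ** n_visits


# With a hypothetical single-year detection probability of 0.5:
one_year = prob_detected_at_least_once(0.5, 1)
three_years = prob_detected_at_least_once(0.5, 3)
```

Under these assumptions, false absences fall from half of the occupied points in a single year to one in eight after three years, which is consistent with the better performance of the pooled model.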

Vegetation variables are subject to measurement 
error and observer variability. There was considerable 
variation in the estimates by the three different ob- 
servers at each point over the years. When the sepa- 
rate years were analyzed using the separate estimates 
of vegetation rather than the three-year average, there 
was a much greater difference between years than that 
shown in Table 8.2. This suggests that observer vari- 
ability can be a serious problem, especially if vegeta- 
tion is measured quickly by crews primarily trained to 
identify birds, or if only one year is available. We took 
the average of all three years for this reason, and it has 
prompted us to subsequently collect more vegetation 
data at the points, using experienced forestry crews 
and additional plots. 

A plot radius of 50 meters is often recommended for 
comparison of bird abundance between different cover 
types because vegetation density can affect the de- 
tectability of individual birds. In addition, if both the 
bird and vegetation data were accurate, we would ex- 
pect the 50-meter-radius model to have greater classifi- 
cation accuracy because the bird data would be more 
tightly associated with the vegetation near the point. In 
this study, however, the 50-meter-radius model was 
slightly less accurate than the main model using a 100-
meter radius. This suggests that a 50-meter radius may
have been insufficient to accurately represent occu- 
pancy in the stand. This is even more likely to be the 
case for less-common species. We conclude that the 
100-meter-radius cutoff not only resulted in an ade- 
quate model, but also it may be preferable in studies 
with only one or a few visits to a point, where we are 
most likely to have an incomplete inventory. 

Most bird-habitat relationship models explain only 
a small proportion of the variance in bird presence or 
abundance (Maurer 1986; Morrison et al. 1987). This
is due to a variety of factors that have often been men- 
tioned (e.g., Rotenberry 1986; Wiens 1989b,c), and 
most of these are probably exacerbated by the nature 
of large exploratory studies. It is important for us to 
both realize the limitations of the method and to de- 
sign surveys and analyses to decrease the effects of 
these problems as much as possible. 

In spite of the numerous reasons that regional-scale 
monitoring data might not be conducive to rigorous 
habitat analyses, we determined a suite of vegetation 
characteristics that were strongly correlated with the 
presence of Swainson’s thrush in forest stands (Table 
8.2). We think it is very important that such data are 
used as fully as possible, as long as the results are not 
overinterpreted. Managers must be made aware of 
model uncertainties so that potential problems are not 
overlooked when final decisions are made (Conroy
and Moore, Chapter 16).


We may also wish to use these bird-habitat relation- 
ship models for the second main goal of building re- 
gression models—predicting the likelihood of a partic- 
ular bird species being present at new, unsurveyed sites. 
Such predictions would be more robust and less loca- 
tion-specific if the predictor variables were more rele- 
vant to biological processes (Austin and Meyers 1996), 
but this is not absolutely necessary for a useful model 
as long as the new sites requiring prediction have the 
same correlational linkages between the measured sur- 
rogate variables and the true variables that influence 
bird occurrence. In any case, the expense of measuring 
predictor variables over wide regions may be prohibi- 
tive unless remotely sensed data can be used. There is 
little reason to develop models for region-wide predic- 
tion until we know what variables are likely to be 
available to managers over all target areas. We can 
then determine if models based on those variables are 
adequately robust for management needs. 
PART 2


Role of Temporal and Spatial Scale 


Michael L. Morrison 


I will briefly summarize the key findings of each chap-
ter within this section and conclude with recom-
mendations for future directions in issues of habitat
scale. Issues of scale were brought to the forefront of 
habitat analysis in the 1980s, as indicated by the at- 
tention given this topic in Wildlife 2000: Modeling 
Habitat Relationships of Terrestrial Vertebrates 
(Verner et al. 1986b). In comparing conclusions 
drawn in Verner et al. (1986b) with those in this sec- 
tion overview and the literature in general (e.g., see 
Morrison et al. 1998 for a review), I think that (1) 
more scientists are aware of spatiotemporal influences 
on habitat use, and (2) more researchers are incorpo- 
rating issues of scale into their studies. In essence, 
Wildlife 2000 raised the issue, and this volume indi- 
cates how far we have followed through with our 
studies in the intervening years. But, as I conclude 
below and elsewhere (Morrison et al. 1998), the stud-
ies that hold the most promise for advancing our
knowledge of wildlife-habitat relationships are rarely 


conducted. 


Recommendations 


The major recommendations of authors in this chap- 
ter can be summarized as follows. I list only the au- 
thors who provide in-depth discussion and data about 
a particular subject or point. 


* Iterative testing of alternative models is needed 
(Conroy and Moore, Chapter 16; Maurer, Chapter 
9; Zabel et al., Chapter 19). 

* Habitat evaluation is influenced strongly by spatial 
scale (Cogan, Chapter 18; Hahn and O'Connor, 
Chapter 17; Johnson et al., Chapter 12; Trani, 
Chapter 11). 

* There is no “correct” scale; the appropriate scale is
goal dependent (Johnson et al., Chapter 12; To-
balske, Chapter 15; Trani, Chapter 11).

* Validation and sensitivity analysis are a necessary
part of model evaluation (Thomas et al., Chapter
10; Tobalske, Chapter 15; Trani, Chapter 11).

* Evaluation of temporal influence on habitat models 
is needed (Greco et al., Chapter 14; Wright and 
Fielding, Chapter 20). 

* Evaluation of the influence of population density
on models is necessary (Johnson and Krohn, Chapter 13).

* Reliable field data on animal distribution and abun-
dance are needed (Johnson and Krohn, Chapter 13;
Tobalske, Chapter 15).


The theme that emerges from these recommenda- 
tions is that we are still doing a poor job of thoroughly 
evaluating our habitat models—hence the emphasis on 
iterative testing, validation, and sensitivity analysis. 
Further, there appears to be confusion among many re-
searchers regarding the “best” scale at which to oper-
ate. There seems to be an unstated assumption in the
literature, especially that of conservation biology,
that broad “landscape” is the most appropriate scale
for modeling. Two points are worth making here. First, 
the scale one works at is dependent on the question 
being asked and how varying scale influences results. 
Second, “landscape” to a human is likely very different 
from “landscape” to a small animal. As developed by 
Morrison and Hall in Chapter 2, the ecological con- 
cept of landscape need not be a large area. Last, the in- 
terrelated issues of temporal scale and population den- 
sity, and their influence on habitat use, have received 
little attention. Johnson and Krohn in Chapter 13 have 
done a good job of raising these issues. 

Maurer critiqued Morrison and Hall and others to 
set the stage for promoting what he described as an al- 
ternative view of defining level and scale. Maurer 
(Chapter 9) suggested that the Morrison and Hall ap- 
proach requires observers to define levels in a system. 
However, Morrison and Hall promote using data to 
define levels of organization rather than a priori set- 
ting them (as is done in most studies). Maurer ex- 
pands on this view by using knowledge to test various 
models that make predictions about the relationship 
between a level and scale. Morrison and Hall and oth- 
ers did not develop this approach but certainly did not 
exclude it. In fact, Maurer's argument coincides with 
the spirit and intent of Morrison and Hall, who con- 
tend that an observer-based decision on a specific rela- 
tionship is not appropriate. 


Conclusions 


After reviewing the habitat literature (e.g., Morrison 
et al. 1998), including the chapters in this section, I 
think we should give serious consideration to the fol- 
lowing points: 


* Terminology still needs to be better standardized 
within this volume by adhering more closely to the 


definitions provided by Morrison and Hall or by 
providing specific alternative definitions. Failure to 
clearly define terms, and relationships between 
them (e.g., habitat versus niche), can cause confu- 
sion when attempting to compare studies. 

* The easy studies have been done. Ultimately, for 
many applications we need to quantify those habi- 
tat factors that determine recruitment into the adult 
breeding population. Other models are surrogates 
of habitat quality (which are often appropriate for a 
specific application). Maurer (Chapter 9) noted that 
although mechanistic models are often preferable, 
the appropriate data are seldom available. 

* Recognizing that scale is goal-dependent is, of 
course, critical but can also direct us toward 
potentially weak and misleading results because 
of our tendency to a priori set the scale. The re- 
sults are likely to be artifacts of such arbitrary 
decisions. 

* Little attention is being given to temporal changes 
in habitat use. Differences between seasons, subtle 
changes within a season, and variations across years 
receive little work. 

* The size of the area sampled and its relation to pop- 
ulation dynamics is seldom studied, nor is the loca- 
tion of the study in relation to the range of the 
species (i.e., whether it is on the edge of the range 
or near the center). The failure to consider popula- 
tion dynamics when developing habitat models is 
probably the major weakness in the study of 
wildlife-habitat relationships. 


We have certainly advanced in our studies of 
wildlife-habitat relationships, as shown by the data 
presented in this volume and the questions raised re- 
garding our approaches. However, several central 
areas of research are not being adequately explored, 
and we remain sloppy in our use of terms and expla- 
nations of the associated concepts. 


CHAPTER 


9 


Predicting Distribution and Abundance: 
Thinking within and between Scales 


Brian A. Maurer 


Applications in conservation biology and wildlife manage-
ment both require ever-increasing sophistication
from models intended to either describe or predict how 
many individuals of a particular species exist and where 
in a particular landscape they are found. Once data are 
available, decisions must be made based on information 
that is either incomplete, or of unknown reliability. 
How can biologists and managers handle the complex- 
ity and uncertainty inherent in the systems with which 
they deal? An emerging paradigm intended to address 
complexity and uncertainty is based on the idea that 
processes in systems like wildlife populations in human- 
dominated landscapes occur at different temporal and 
spatial scales (e.g., Bissonette 1997a,b). 

Part 2 is about how information from different 
temporal and spatial scales can be used in models to 
strengthen biological inferences and conservation poli- 
cies. Generally, the chapters in this section examine 
wildlife-habitat systems or decision-making problems 
from a multiscale perspective. That is, each system 
considered is assumed to be influenced by a complex 
set of causes whose effects on system behavior occur 
at more than one scale. Although they do not explic- 
itly consider such multiscale processes, Conroy and 
Moore’s (Chapter 16) emphasis on comparison of al- 
ternative models in an adaptive management frame- 
work is broad enough to be used with models that ex- 


plicitly incorporate processes at different scales. 


Here, I consider two related issues dealing with 
temporal and spatial scales. The first is the definition 
and meaning of the term “scale.” A standard defini- 
tion has been offered by Morrison and Hall (Chapter 
2), but having such a definition to build on doesn’t 
mean that we understand the usefulness and limita- 
tions of the concept that generated the definition. 
Thus, the second issue is determining how scale as a 
concept can be applied to analyses of the behavior of 
complex wildlife-habitat systems or conservation deci- 
sion systems. I will consider the chapters in this sec- 
tion in light of these two issues. Assuming that these 
chapters represent the state of the art, it is important 
to understand how close we are to being able to apply 
appropriate concepts of temporal and spatial scale to 
assessing the reliability of the complicated models we 
are capable of developing with modern computer and 
remote sensing technologies. 


Definitions and the Meaning of Scale 


Morrison and Hall (Chapter 2) suggest a standard def- 
inition of scale as “the resolution at which patterns are 
measured, perceived, or represented.” They draw their 
definition of scale from King (1997), who defines scale 
as “the physical dimensions of a thing or event.” Phys- 
ical dimensions, King (1997) explains, imply that 
measurements are taken by an observer. That is, a scale 
does not exist without an observer. A fact yet to
become well established in ecology is that an observer
takes measurements in units (e.g., meters, joules, etc.), 
and manipulations of those measurements ought to in- 
clude the units rather than exclude them (Schneider 
1994). Thus, scales do not exist without observer- 
defined units of measurement. The fact that so many 
statistical calculations are done without reference to 
those units, especially in multiscale analyses, can lead to 
confusion regarding the meaning of those calculations. 

Following King (1997), Morrison and Hall point 
out that the term “scale” does not mean the same 
thing as the term “level.” They offer in the appendix 
to Chapter 2 the confusing definition of level as “the 
level of organization revealed by observation at the 
scale under study.” This is clarified both in King
(1997) and by Morrison and Hall in Chapter 2. Level 
refers to a rank within a hierarchically organized sys- 
tem. Basically, hierarchies are formed as aggregations 
of systems into larger systems (although there are 
many variations on this theme). Any collection of ag- 
gregations defines a level in the system. For wildlife bi- 
ologists, this means that a level refers to a hypothe- 
sized aggregation (e.g., individuals within populations, 
patches within landscapes). Confusion often arises be- 
cause the scale an observer uses need not correspond 
to any particular level (Schneider 1994; King 1997). 

What is needed to reduce such confusion is a dis- 
tinction between the two types of models represented 
by levels and scales. A level is a theoretical construct 
used to impose conceptual order on the thinking of scientists
studying complex systems. A scale is an empir-
ical construct used to organize data collected by scien- 
tists measuring complex systems. Relating a range of 
scales to a set of levels can be viewed as a model- 
fitting procedure. Given that there may be more than 
one set of levels that might describe a complex system 
like a wildlife population, a scientist might have sev- 
eral different empirical statements incorporating dif- 
ferent ranges of scales that might be evaluated accord- 
ing to some “goodness-of-fit” criterion to choose the 
“best” alternative. So, in studies of complex systems, 
which comes first, the level or the scale? 

Some authors advocate that recognition of levels 
should be extracted from data rather than assumed to 
exist a priori (King 1997). This approach draws upon 
treatments of ecology such as Peters’ (1991) A Critique
for Ecology, which advocates that induction is
the only reliable way to understand ecology. The in- 
ductionist view, however, limits the progress that can 
be made in a field of science (Pickett et al. 1994; Mau- 
rer 1999). A complementary approach is the deduc- 
tionist’s view. According to this view, rather than mak- 
ing assumptions about the existence of levels, 
scientists define levels in a manner such that they may 
be evaluated empirically. That is, levels are defined a 
priori and used to construct a set of alternative models 
of the process being examined. The models are used to 
make predictions about measurements taken at differ- 
ent scales. The model defines a relationship between a 
level (e.g., population) and a scale-dependent process 
(e.g., population dynamics). Data collected at appro- 
priate scales are then used to construct tests of the pre- 
dictions made by each model using a well-defined 
goodness-of-fit criterion. As Conroy and Moore 
(Chapter 16) suggest, in a decision-making context, 
this may be an iterative procedure, where what is 
learned at one iteration informs the choices of 
models constructed and scales examined for the next 
iteration. 


Modeling Variable Populations 
in Space and Time 


Given the problems inherent in using the concept of 
scale in a scientifically useful manner, what is missing 
in many current applications of scale to wildlife- 
related problems? I suggest that most often, what is 
lacking is a clear a priori definition of levels of interest 
and the implied processes that result from those defini- 
tions. In the first part of this chapter, I describe a rela- 
tively general definition of levels that might apply to a 
wide variety of wildlife-habitat systems. This model 
implies that spatial and temporal patterns of popula- 
tions within landscapes are complex because the 
processes that cause them are thought to operate at 
many spatial and temporal scales (Villard et al. 1998). 
Furthermore, these different processes are often sto- 
chastic in some sense, so that the effects they produce 
on a population are not consistent over space and 
time. 

In the face of such complexity, it is unlikely that 
any single modeling approach will be successful for all
purposes. In the second part of this chapter, I describe
the strengths and limitations of different modeling ap- 
proaches within the context of the modeling goals of a 
study or management activity. 


Mechanistic and Phenomenological 
Models of Populations 


Because populations are complex and are regulated by 
processes at multiple scales, it is likely that no single 
modeling technique will capture every important spa- 
tial or temporal aspect of a population. Different 
kinds of models have different uses and limitations. To 
understand these limitations, I outline a framework of 
the major pathways of causation assumed to underlie 
patterns of population dynamics and dispersion with- 
in a landscape. 

Consider the following conceptual model as defin- 
ing a set of levels that describe the complexity of a 
wildlife-habitat relationship. At the lowest level, the 
availability and quality of appropriate habitat directly 
controls the ability of individual organisms within a 
population to undergo their life histories (Hildén 
1965; Rotenberry 1981). Individuals with access to 
sufficient resources are able to survive and accrue 
enough energy to reproduce. Availability of habitat 
will also influence the likelihood that individuals will 
disperse in and out of the population (Newton 1998). 
The summed patterns of reproduction, survival, and 
dispersal across all individuals in a population deter- 
mine the rates of change that the population will ex- 
perience. When these rates are played out in space and 
time, patterns of population dynamics and dispersion 
within a landscape emerge (Fig. 9.1). Note that this 
model of causation explicitly recognizes that different 
processes operate at different levels. There are three 
levels: the individual level, the population level, and 
the landscape level. 

Two general types of models have been used to de- 
scribe this general view of how populations change in 
space and time. The first type of model can be termed 
“mechanistic.” The intent of such models is to describe
the detailed relationship between relevant fea-
tures of the habitat and their effects on the life histo- 
ries of individuals within the population (Fig. 9.1). 
Typically, the behavior of each individual in the population
is modeled in relationship to patterns of resource
distribution within a specific landscape (Pulliam
et al. 1992; McKelvey et al. 1992; Judson 1994;
Raphael et al. 1998; Villard et al. 1998). Individual
behaviors are often modeled as stochastic processes,
and the results of the model generally attempt to describe
in statistical terms the relationship between
habitat attributes and the dynamics and dispersion of
the population. Such models are often used to describe,
at least qualitatively, the ability of the population
to persist given different manipulations or
changes that might be applied to the habitat.

[Figure 9.1 diagram. Boxes: Habitat; Life histories of individuals;
Vital rates, migration rates, extinction rates; Population density
and dispersion. Model labels: Mechanistic models (SEPMs);
Population/metapopulation models (logistic, Levins); Statistical
models (logistic regression).]

Figure 9.1. Solid lines represent a simplified schematic of
pathways of causation relevant to predicting abundance and
distribution. Mechanistic models attempt to model directly the
link between habitat characteristics and life histories of
individuals. Population/metapopulation models are phenomenological
because they attempt to model the relationship between
habitat characteristics and population vital rates or
metapopulation colonization/extinction rates without reference to
individual life histories. Statistical models attempt to relate the
outcome of population and individual level processes to habitat
indirectly, without explicit reference to spatial or temporal
dynamics.

The second type of model encountered can be con- 
sidered “phenomenological,” that is, such models at- 
tempt to describe the relationship between habitat and 
population dynamics indirectly, without incorporating 
data on the actual life history mechanisms that are ul- 
timately responsible for population change. There are 
two types of phenomenological models. Statistical 
models seek only to uncover a correlation between 
population and habitat patterns. Models of this sort 
have been widely used to assess which particular as- 
pects of the habitat are related to some population at- 
tribute such as density or dispersion (James and 
Shugart 1970; James 1971; Smith 1977; Gauch 1982; 
Pielou 1984). Correlations between populations and 
habitat characteristics uncovered by such techniques
have no essential cause-and-effect relationship, and
hence such statistical models have been criticized as 
being of limited scientific value (Karr and Martin 
1981; Rexstad et al. 1988). 

The other class of phenomenological models at- 
tempts to relate habitat to population parameters, 
such as birth and death rates, or to metapopulation 
processes of extinction and colonization (Fig. 9.1). 
Population parameters such as birth rates and metapopulation
parameters such as extinction rates are not mechanistic,
in the sense that they describe statistically a
large number of individual events. In population mod-
els, rates are calculated on a per capita basis, which is 
essentially the same as averaging attributes of individ- 
ual organisms, such as clutch size, nesting success, 
etc., across a large number of organisms. Typically, in- 
dividual rate parameters are used to estimate the tra- 
jectory the population is expected to follow over time 
and can be expanded to examine patterns of temporal 
dynamics in space (e.g., Lele et al. 1998). In metapop- 
ulation models, rates of colonization and extinction 
are calculated on a per patch basis, again, essentially 
an averaging of many events across a large number of 
patches. Colonization and extinction rates are used to 
examine patterns of persistence of a species within a 
collection of natural or human-created habitat patches 
(McCullough 1996). 
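
As a minimal sketch of the per-patch averaging described above, the following Python fragment integrates the classic Levins patch-occupancy equation, dp/dt = c p (1 - p) - e p, which Figure 9.1 lists among the population/metapopulation models. Here p is the fraction of occupied patches, c a colonization rate, and e an extinction rate. The function names, the parameter values, and the simple Euler time step are illustrative assumptions and are not taken from any chapter in this volume.

```python
def levins_occupancy(p0, c, e, dt=0.1, steps=2000):
    """Euler integration of the Levins model dp/dt = c*p*(1 - p) - e*p.

    p0: initial fraction of occupied patches (hypothetical value)
    c:  per-patch colonization rate
    e:  per-patch extinction rate
    """
    p = p0
    for _ in range(steps):
        p += dt * (c * p * (1.0 - p) - e * p)
    return p


def levins_equilibrium(c, e):
    """Nonzero equilibrium occupancy 1 - e/c, or 0 when e >= c."""
    return max(0.0, 1.0 - e / c)


if __name__ == "__main__":
    # Assumed rates, chosen only for illustration.
    print(levins_occupancy(p0=0.1, c=0.5, e=0.2))
    print(levins_equilibrium(c=0.5, e=0.2))
```

Note that the model tracks only the proportion of patches occupied; nothing about individual life histories enters, which is what makes it phenomenological in the sense used here.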


Multiscale Models and Reliability Assessment 


Given the three types of models described above, what 
are the strengths and limitations of each type, and 
how is this likely to affect their use in management of 
complex wildlife-habitat systems? It should be clear 
after briefly reflecting on the structure of Figure 9.1, 
that each kind of model is explicitly tied into a differ- 
ent scale of measurement and so presumably describes 
different levels of organization. Mechanistic models 
attempt to describe wildlife-habitat systems at a scale 
corresponding to individual organisms. They can be 
used to make statements about higher levels by aggre- 
gating or averaging their output over large collections 
of individuals (e.g., Noon and McKelvey 1996a). Population/
metapopulation models do not consider the
detail of individual life histories but instead assume a 
scale of measurement that implies a spatial aggregation
of individuals. Metapopulation models differ
from population models primarily by making assumptions
about the dispersal abilities of individuals. If the
spatial structure of the habitat is discontinuous, and
the size of the discontinuities is larger than the dispersal
ranges of individuals, then a metapopulation
model is assumed to be superior to a population 
model for describing the spatial pattern of the system. 
Statistical models forgo any detail about causation in
favor of techniques that correlate attributes of
populations at the landscape scale with
habitat features.

The strength of a mechanistic model is that, if 
properly validated, it provides a wealth of detailed in- 
formation about how habitat affects individual organ- 
isms. Thus, any descriptions obtained from aggregat- 
ing the results of such models can be defended on 
scientific grounds as a valid description of the causes 
and effects that generate a spatial pattern in a land- 
scape. But the wealth of detail of such models carries 
with it a limitation. Because the ecosystems in which 
individual organisms live are constantly changing in 
space and time, parameter estimates obtained at one 
location or time may not be valid at other times and 
places. Moreover, the structure of a mechanistic model 
must match the details of the natural history of indi- 
vidual species and thus must be created anew for each 
species. Conceivably, model structure could even be 
different for the same species if it lives in different 
habitats at different geographic locations. 

Population/metapopulation models at least par- 
tially overcome the limitations in time and space of 
mechanistic models because they average events across 
space (and conceivably over short periods of time such 
as a single breeding season). However, averaging 
across individuals means that variability among indi- 
viduals must be considered as “process error.” For ex- 
ample, many attempts to estimate extinction times are 
based on stochastic population models that incorpo- 
rate “environmental” and “demographic” variability 
(Leigh 1981; Goodman 1987). From the perspective 
of a mechanistic model, this variability would be im- 
plicit in the different conditions experienced by each 
individual. A population or metapopulation model 
may be more general, and one type of model may be 
applicable to a wide variety of species or locations. 

Statistical models are the most general of all and
have no restrictions on the kind of species or habitats
that can be modeled. They require only that data meet 
specific statistical criteria (such as normality, independ- 
ence, etc.). Often, these criteria are easily met by trans- 
formation, initial exploratory data analyses, and diag- 
nostic statistics. The price paid for such flexibility is 
that such models say little about causation. Only very 
general assumptions about how habitat variables affect 
population dispersion and dynamics need be made, 
such as “population density increases when a particular 
habitat variable increases.” Statistical models make
statements exclusively about pattern at the landscape 
level. Paradoxically, this is true regardless of the scale at 
which habitat is measured. For example, one common 
way of developing a statistical model is to measure 
habitat variables within the territory or home range of 
individual organisms. Correlation is established using 
techniques such as principal components analysis, logis- 
tic regression, and so forth. However, since these statis- 
tical techniques consider neither population dynamics 
nor the behavior or life history of the individuals within 
a territory, they are conceptually linked to the landscape 
level rather than to the level of the individual organism. 
There are no specific cause-effect assumptions needed in 
order to construct the model. 

The strengths and limitations of the different types 
of models described previously have important impli- 
cations for how they are used. There are two general 
uses for these models: statistical description and statis- 
tical prediction. The differences are best recognized by 
examining the goodness-of-fit criterion that is used to 
evaluate the model's reliability. 

Statistical description is accomplished by examin- 
ing how well a model describes the data that are used 
to estimate its parameters. This is what is often 
referred to as “calibration” of a model. Typical 
goodness-of-fit criteria are “proportion of variance
explained statistics” such as R² values for regression-type
models, chi-square statistics based on “observed
minus expected” calculations, and statistics based on 
likelihood ratios. The emphasis of statistical descrip- 
tion is to produce a model that faithfully represents 
the statistically important aspects of a data set. What 
these aspects are, of course, depends on the kind of 
model being used. 
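
For reference, the standard textbook forms of the three kinds of descriptive criteria named above can be written as follows. The symbols are introduced here only for illustration: y_i and \hat{y}_i are observed and fitted values, O_j and E_j are observed and expected counts in category j, and L_0 and L_1 are the maximized likelihoods of a simpler and a more general model fit to the same data.

```latex
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},
\qquad
\chi^2 = \sum_j \frac{(O_j - E_j)^2}{E_j},
\qquad
G = -2 \ln \frac{L_0}{L_1} = 2\,(\ln L_1 - \ln L_0)
```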

Statistical prediction is accomplished by examining
how well a model describes new data that were not involved
in the estimation of its parameters. Typically, a
data set is divided into an “estimation” data set used 
to obtain parameter estimates and a “validation” or 
“test” data set for which values are predicted using 
the model generated from the estimation data set. 
Goodness-of-fit criteria include cross-validation statis- 
tics and likelihood ratios, where likelihoods are calcu- 
lated separately for the estimation and prediction data 
sets. Statistical predictions made by mechanistic or 
population/metapopulation models tend to be more 
reliable than those made by statistical models. This is 
probably because statistical models tend to have relatively
few cause-effect assumptions built into them.
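
The distinction between statistical description and statistical prediction can be sketched in a few lines of Python. This is only an illustration under simple assumptions: a straight-line model fit by ordinary least squares, a random split into estimation and validation sets, and R² as the goodness-of-fit criterion. The function names and the 70/30 split fraction are hypothetical choices, not a procedure taken from the chapters discussed here.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def r_squared(xs, ys, a, b):
    """Proportion of variance in ys explained by the line a + b*x.

    Assumes the ys are not all identical (otherwise the total sum of
    squares is zero and the ratio is undefined).
    """
    my = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def describe_and_predict(xs, ys, frac=0.7, seed=0):
    """Return (R2 on estimation data, R2 on withheld validation data)."""
    n = len(xs)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Keep at least two points in each subset.
    cut = max(2, min(n - 2, round(frac * n)))
    est, val = idx[:cut], idx[cut:]
    ex, ey = [xs[i] for i in est], [ys[i] for i in est]
    vx, vy = [xs[i] for i in val], [ys[i] for i in val]
    a, b = fit_line(ex, ey)  # parameters come from estimation data only
    return r_squared(ex, ey, a, b), r_squared(vx, vy, a, b)
```

The first value returned is a measure of description (fit to the data used for estimation). The second is a measure of prediction (fit to data withheld from estimation); it can be much lower, or even negative, when the fitted relationship does not generalize.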


Temporal and Spatial Scales: 
An Assessment 


The preceding discussions recognized three important 
aspects of the modeling process when constructing 
multiscale models of complex wildlife-habitat systems. 
The first is the philosophical approach used to assess 
the relationship between scales and levels: inductive if 
levels are inferred from the data, deductive if data are 
used to test predictions about processes inferred logi- 
cally from a theoretical model describing levels. The 
second is model type: mechanistic models that de- 
scribe the details of organismal life histories, popula- 
tion/metapopulation models that describe vital rates 
or colonization/extinction rates, and statistical models 
that are used to establish correlations between habitat 
features and population patterns. The third is the 
modeling goal: statistical description if the model is 
evaluated with data used to estimate its parameters, 
statistical prediction if the model is evaluated with 
data not used to estimate its parameters. 

Decisions regarding each aspect of the modeling 
process are most appropriately made based on the 
context in which the model is constructed and the ulti- 
mate uses that will be made of the model. The choice 
of philosophical approach is not arbitrary. Inductive 
inference of levels is appropriate where there is a gen- 
eral lack of information about the nature of the 
system being studied. The goal of using an inductive
approach is to establish to whatever degree of reliability
the data allow what natural levels might exist in
the system being studied. When there is some prior 
knowledge about how the system might be expected 
to behave, then this information can be used to con- 
struct deductive predictions, which can then be as- 
sessed using appropriate statistical techniques. 

The type of model chosen often reflects constraints 
on how much time or effort can be expended collecting 
data relative to the size of the system being considered. 
Because mechanistic models are data intensive, they are 
most likely only applicable when logistical constraints 
allow collection of large amounts of data. For the same 
amount of effort, population and metapopulation 
models generally allow for the collection of data across 
larger spatial and temporal scales than mechanistic 
models. The tradeoff is between intensive information 
about relatively few locations versus larger amounts of 
less-intensive information. Statistical models can be 
constructed using vast amounts of information (e.g., 
remotely sensed images) and across the largest spatial 
and longest temporal scales. The amount of informa- 
tion about the processes of interest is relatively low in 
statistical models, but this low information content al- 
lows for broad, extensive coverage. 

If a model is to be used only to infer something 
about the particular system from which data were 
collected, or to manage that system over a time period
when conditions aren't expected to change, then
statistical description should suffice as a method to
assess the reliability of the model. If a model is to be
generalized to other systems or used to guide management
decisions about other systems, then it is important
to assess its reliability at making statistical predictions.

TABLE 9.1.

Classification of chapters in Part 2 by the different aspects of multi-scaled modeling they incorporate.

Chapter author       Study type         Philosophical approach  Model type      Reliability assessment
Cogan                Decision making    Inductive               Statistical     Description
Zabel et al.         Decision making    Inductive               Statistical     Prediction
Conroy and Moore     Decision making    Deductive               Metapopulation  Prediction
Trani                Model development  Inductive               Statistical     Description
Tobalske             Model development  Inductive               Statistical     Description
Hahn and O'Connor    Model development  Deductive               Statistical     Description
Thomas et al.        Model development  Inductive               Statistical     Description
Greco et al.         Model development  Deductive               Statistical     Prediction
Johnson and Krohn    Model development  Deductive               Statistical     Prediction
Johnson et al.       Model development  Deductive               Statistical     Description
Wright and Fielding  Model development  Deductive               Statistical     Prediction

The chapters in this part illustrate a variety of combinations
of the aspects of multiscale modeling I have
described above (Table 9.1). Three chapters described
studies of decision-making systems that incorporated
wildlife-habitat models; the others described the development
of wildlife-habitat models. Two important
points emerge from Table 9.1. First, there is a preponderance
of statistical models in these chapters. A cursory
scan of titles of chapters in other sections suggests
similar patterns throughout this volume. The
paucity of other types of models, I think, reflects the
reality of conservation decision making with limited
resources. Time, money, and person-power are too
limited to be expended on intensive studies except in
the most exceptional cases. Metapopulation models,
viability analyses, and individual-based models are
most often associated with species of economic (e.g.,
game species) or legal (e.g., endangered species) importance.
Most decision making in conservation biology
and wildlife management, it seems, will continue
to be based on statistical models, implicitly limiting
inferences about processes to those that can be measured
at landscape scales. If most multiscale models are
in fact statistical models, this means that most multi- 
scale models do not attempt to identify levels by vary- 
ing the scale at which data are collected. Rather, they 
attempt to infer properties of lower levels (e.g., habitat 
suitability, population persistence) from models re- 
stricted to the landscape scale. The models are multi- 
scale because data are collected at more than one 
scale. The scales at which data are collected, however, 
do not necessarily correspond to any particular hierar- 
chical level. It is not impossible to make inferences in 
such circumstances. It simply means that what infer- 
ences are made are likely to contain much more uncer- 
tainty than models where measurement scales corre- 
spond closely with process scales. 

The second point that emerges from Table 9.1 is the 
need for reliability assessments to be based on statisti- 
cal prediction rather than statistical description when 
a model is to be used to make predictive statements. 
Every chapter in this section clearly indicates that its
models are intended to be used as predictive tools, 
that is, to make statements about systems other than 
the one in which the models were generated. Yet, four 
of the chapters provided reliability assessments based 
on measures of statistical description, not statistical 
prediction. It is incumbent upon the users of multi- 
scale models to make sure they understand the degree 
to which the reliability of the model they are using has 
been assessed and act accordingly. Less credibility in 
the decision-making process should be assigned to 
models evaluated by statistical description. 


Recommendations 


Given that, most of the time, decision makers will be 
limited to statistical models that are conceptually 
linked to landscape scales, it is important that a large 
number of alternative models be evaluated whenever 
possible. Zabel et al. (Chapter 19), for example, initially
examined nearly one hundred models in the first
phases of their decision-making process. A smaller 
number were selected using statistical description to 
be examined more rigorously using a validation data 
set. This kind of iterative use of models in the decision-making
process is crucial when individual models
contain a large degree of uncertainty. Conroy and
Moore (Chapter 16) describe a rigorous method for
such iterative decision making and illustrate their
method using a set of simple metapopulation models
for two species with conflicting habitat requirements.
The message to managers should be clear: when decisions
are to be based on models containing relatively
large degrees of uncertainty, be expansive in evaluat- 
ing models that are to be used in the decision-making 
process. 

If large numbers of models are to be evaluated, 
more sophisticated reliability assessments are needed 
(Maurer 1998). This is especially true when models 
vary in complexity. With statistical models, it is often 
possible to transform variables to allow the use of 
likelihood-based model evaluation criteria, such as
Akaike's Information Criterion (AIC; see Hilborn 
and Mangel 1997 for a discussion of this statistic). 
AIC can be used in mechanistic and population/ 
metapopulation models when assumptions are made 
about the statistical distributions of model parame- 
ters (e.g., colonization and extinction rates). 
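
A minimal sketch of how AIC might be used to rank alternative models is given below. It assumes presence/absence data and models that each supply a predicted probability of occurrence per site, so that a Bernoulli log-likelihood applies. The site records, model names, probabilities, and parameter counts are invented purely for illustration (they are not fitted estimates). AIC is computed in its usual form, 2k - 2 ln L, where k is the number of estimated parameters and L is the maximized likelihood.

```python
import math


def bernoulli_log_lik(observed, probs):
    """Log-likelihood of 0/1 observations given predicted probabilities."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(observed, probs))


def aic(log_lik, n_params):
    """Akaike's Information Criterion: 2k - 2 ln L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_lik


def rank_by_aic(candidates):
    """candidates: {name: (log_lik, n_params)} -> [(name, AIC, delta AIC)]."""
    scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    best = min(scores.values())
    return sorted(((name, s, s - best) for name, s in scores.items()),
                  key=lambda row: row[1])


if __name__ == "__main__":
    # Hypothetical presence (1) / absence (0) records at eight sites.
    observed = [1, 1, 1, 0, 0, 1, 0, 0]
    # Hypothetical predicted probabilities from two candidate models.
    null_probs = [0.5] * 8
    habitat_probs = [0.9, 0.8, 0.7, 0.2, 0.3, 0.6, 0.1, 0.2]
    candidates = {
        "null": (bernoulli_log_lik(observed, null_probs), 1),
        "habitat": (bernoulli_log_lik(observed, habitat_probs), 2),
    }
    for name, score, delta in rank_by_aic(candidates):
        print(f"{name}: AIC={score:.2f}, delta={delta:.2f}")
```

Because the penalty term grows with the number of parameters, a more complex model is preferred only when its gain in likelihood outweighs the added complexity, which is why a criterion of this kind is useful when the candidate models vary in complexity.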

Although it might be preferable to have detailed 
mechanistic models to base conservation decisions 
on, this will rarely be possible. Thus, it is very im- 
portant that when we have detailed life history infor- 
mation about individuals, attempts are made to un- 
derstand how this information can be used to 
understand patterns that emerge at larger spatial 
scales and longer temporal scales. For example, Stith 
et al. (1996) use information on dispersal of individ- 
ual Florida scrub-jays (Aphelocoma coerulescens) to 
define degrees of connectedness among local subpop- 
ulations and to infer the spatial structure of the en- 
tire geographic population of the species in Florida. 
Although not directly applicable to other species, 
such studies can provide a way to construct hypothe- 
ses about spatial structure of populations that can be 
tested using data collected at landscape scales. Greco 
et al. (Chapter 14) describe an example of this kind 
of inference using data on the yellow-billed cuckoo 
(Coccyzus americanus) along the Sacramento River 
in California. 

Explicit consideration of spatial and temporal 
scales in modeling wildlife-habitat systems provides a 
paradigm upon which defensible conservation decisions
can be made. Although such models will
continue to have large degrees of uncertainty associated
with them, when they are integrated into an adaptive
management framework, decisions can be made and
re-evaluated as necessary in a relatively objective manner.
This type of adaptive decision making is preferable
to abandoning conservation policy decisions to
purely political processes. 


CHAPTER 


10 


A Comparison of Fine- and Coarse-resolution 
Environmental Variables Toward Predicting 
Vegetation Distribution in the Mojave Desert 


Kathryn Thomas, Todd Keeler-Wolf, and Janet Franklin 


A major constraint to mapping vegetation in arid
areas is the cost of obtaining suitable imagery for 
direct detection of the vegetation types being mapped. 
Vegetation is usually sparse and of low structure. Direct 
detection requires very high-resolution imagery, which 
is usually cost prohibitive. Indirect methods, such as 
predictive modeling, can be employed in the mapping 
procedure to augment the use of lower-resolution im- 
agery. In this analysis, we examine whether environ- 
mental variables derived from digital data can substi- 
tute for field-derived observations, and we examine the 
effect of resolution differences between field-collected 
variables (fine resolution) and map-derived variables 
(coarse resolution) in estimating vegetation distribution. 
In the Mojave Desert of California, vegetation 
types were mapped at a 5-10-hectare resolution using 
a combination of (1) interpretation of 1:32,000 true- 
color aerial photography, (2) delineation on remotely 
sensed imagery (SPOT panchromatic satellite imagery
with 10-meter resolution obtained from the California
Department of Fish and Game; copyright CNES/
SPOT Image Corp. 1994), and (3) predictive modeling
in a geographic information system (GIS) environ- 
ment. Distributions of vegetation types were predicted 
by applying decision tree analysis (Michaelsen et al. 
1987; Franklin 1995) to over two thousand vegetation 
plot samples. The independent variables used for prediction
were obtained from coarse-resolution maps
and include macro topography derived from a digital
elevation model (DEM) (elevation, slope, and aspect), 
terrain variables (landforms and rock/sediment com- 
position), and regional climate variables (precipitation 
and temperature). 

The data set used to develop predictive vegetation 
models included plot samples that were obtained in 
the field using a two-stage random stratified sampling 
design (Franklin et al. in press) and those obtained 
from existing vegetation studies. Elevation, slope, and 
aspect were directly measured in the field and terrain 
variables were described. 

The predictive modeling approach is based on the 
assumption that the distribution of vegetation is corre- 
lated with environmental factors that can be measured 
in the field and from digital maps. However, environ- 
mental variables measured in the field usually describe 
a variable, such as slope angle or landform type, at a 
different resolution than the value derived from a 
coarse-resolution map of that variable. In order to ex- 
amine this issue we asked these questions: 


1. What is the strength of relationship between envi- 
ronmental variables derived from fine-resolution 
field observations with the same environmental vari- 
ables derived from coarser-resolution digital maps? 

2. What is the relationship of the environmental vari- 
ables, both fine and coarse resolution, to vegeta- 
tion types? 

The study area for this project is a 5-million- 
hectare area in the eastern Mojave Desert of Califor- 
nia. This area includes all of Death Valley National 
Park, the Mojave National Preserve, Fort Irwin Mili- 
tary Reservation, China Lake Naval Weapons Center, 
Marine Corps Air Ground Combat Center, public 
land managed by the Bureau of Land Management 
(BLM), and some private land. The Department of 
Defense Legacy Program and the Strategic Environ- 
mental Research and Development Program (SERDP) 
funded the project. It was conducted by a team con- 
sisting of federal (U.S. Geological Survey), state (Cali- 
fornia Department of Fish and Game), and university 
(San Diego State University) researchers. The final 
products, including a map of actual vegetation types 
for the central section of the Mojave Ecoregion, can 
be found on the Mojave Desert Ecosystem Program 
web site at http://www.mojavedata.gov. 


Methods 


Measurements of plant species composition and associ- 
ated environmental variables were made during the fall 
of 1997, winter and spring of 1998, and spring of 
1999 on 1,000-square-meter releves. During the 1997 
and 1998 field-sampling season, the releves were 
placed using a random stratified sample based on rep- 
resentative sampling of environmental types at a 
1-kilometer resolution. Environmental types were 
characterized by four climate variables (average winter 
and summer precipitation and average January mini- 
mum and July maximum temperature), geologic sub- 
strate (based on a digitized 1:750,000 geological map 
of California (California Department of Conservation, 
Division of Mines and Geology, originally compiled by 
Charles W. Jennings, 1977)), and topographic position 
(Franklin et al. in press). During the spring of 1999, 
releves were placed nonrandomly in order to increase 
the sample size for certain rare or undersampled vege- 
tation types. The coordinate position of each releve 
was determined using a global positioning system 
(GPS) with at least 5-meter accuracy in most cases. 


Classification of Vegetation Types 


Vegetation types as defined by alliances were assigned 
to each plot sample in a two-step process. Grossman 


et al. (1998) define alliances by their floristic compo- 
sition within the National Vegetation Classification 
System hierarchy (FGDC 1997) and note that they occur 
at the two lower levels of the standardized classifica- 
tion system. Species data for each plot were standard- 
ized to a common nomenclature using The Plants 
Database (NRCS 1999). A combination of classifica- 
tion algorithms (Twinspan, indicator species analysis, 
and cluster analysis) (McCune and Mefford 1997) 
was applied independently to existing vegetation plot 
data and the data collected in our 1997-1999 surveys. 
T. Keeler-Wolf and K. Thomas (unpublished data) de- 
veloped concordance rules between the existing and 
new data to define vegetation alliances and applied 
these rules to the total data set in order to identify 
consistent species groupings. 


Fine-resolution Environmental Variables 


The fine-resolution variables used in the analysis were 
derived from five field-collected variables: elevation 
(meters), slope (degrees), aspect (degrees), landform, 
and rock/sediment composition. Elevation was deter- 
mined using a global positioning system or, in a few 
cases, a 1:24,000 topographic map. Aspect was deter- 
mined by aligning a compass to the direction that 
water would be expected to flow from the plot and 
measured as the degrees from north. Slope was meas- 
ured in that direction with a clinometer. Aspect was 
converted to a "southwestness" index using the trans- 
formation (cos(aspect - 255) + 1) * 100. This south- 
westness index varied from 0 to 200, with a value of 
300 assigned to flat terrain. Aspect and slope meas- 
urements were made over a slope distance of approxi- 
mately 90 meters. 
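
The southwestness transformation can be sketched in a few lines. This is an illustrative sketch rather than the authors' code: the function name `southwestness`, the `is_flat` flag, and the `reference_deg` parameter are assumptions introduced here, and the cosine is assumed to act on the aspect difference expressed in degrees. The 255-degree reference and the flat-terrain code of 300 are taken from the text.

```python
import math

FLAT_CODE = 300  # value the text assigns to flat terrain


def southwestness(aspect_deg, is_flat=False, reference_deg=255.0):
    """Return (cos(aspect - reference) + 1) * 100, or 300 for flat plots.

    aspect_deg is measured in degrees from north. The is_flat flag is a
    hypothetical way of marking plots that have no meaningful aspect.
    """
    if is_flat:
        return FLAT_CODE
    difference = math.radians(aspect_deg - reference_deg)
    return (math.cos(difference) + 1.0) * 100.0


# The index peaks when the plot faces the reference direction and
# falls to zero when it faces the opposite way.
for aspect in (75, 165, 255, 345):
    print(aspect, round(southwestness(aspect), 1))
```

Keeping flat plots at a value outside the 0 to 200 range means they can be separated from sloping plots before any correlation is computed.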

The field crew, working in pairs, visually deter- 
mined landform and geological substrate categories. 
They used a 38-category classification of types defined 
using a preliminary classification developed for a par- 
allel landform and rock/sediment composition-map- 
ping project sponsored by the Legacy Program at 
Louisiana State University (R. Dokka, Louisiana State 
University, pers. com.). The categories were aggre- 
gated into fewer types. The seven recoded landform 
categories were (1) rocky highland, (2) arroyo, (3) up- 
land alluvial deposits, (4) wash, (5) fluvial floodplain, 
(6) playa, and (7) dunes and sand sheets. The five 
composition categories were: (1) igneous, (2) meta- 
morphic, (3) calcareous carbonate, (4) evaporite, and 
(5) sedimentary. The six-person field crew received 
orientation to recognizing landform and composition 
categories, but they were not specifically trained in ge- 
omorphology or geology. 


Coarse-resolution Environmental Variables 


Coarse-resolution variables were derived from digital 
maps in a GIS environment. The UTM location of each 
field sample was associated with digital maps of envi- 
ronmental variables in order to obtain the value at that 
location for the variable. Elevation (meters) and slope 
(degrees) were derived from USGS 30-meter digital ele- 
vation models (DEMs) for the study area. Aspect was 
also derived from the DEM and transformed into a 
southwestness index using the same transformation as 
was used for the fine-resolution environmental vari- 
ables. Landform and rock/sediment composition were 
derived from the digital maps developed at Louisiana 
State University for each of these features. The nominal 
resolution for the landform and rock/sediment compo- 
sition mapping is a 10-hectare minimum mapping unit. 
The coarse-resolution landform and rock/sediment 
compositions were recoded into the same aggregated 
categories as the fine-resolution landform and rock/ 
sediment composition. 


Relationship of Fine- and Coarse-resolution 
Environmental Variables Pairs 


A Shapiro-Wilk W test for normal distribution was 
conducted with the plot sample values for elevation, 
slope, aspect, and southwestness. None of the sample 
distributions were normal. Accordingly, a nonpara- 
metric measure of association was computed for each 
fine/coarse-resolution pair of environmental variables. 
Significance of each comparison was tested with 
Spearman's rho and Kendall's tau-b tests. Association 
between the fine/coarse-resolution pairs of nominal 
variables was determined with two-way contingency 
table analysis. The Kappa statistic was used to meas- 
ure the degree of agreement (0-1) between the pairs of 
nominal variables where they have the same set of val- 
ues. All statistical analysis was performed using JMP 
IN software, version 3.2.6 (Sall and Lehman 1996). 
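
To make the rank-based comparison concrete, the sketch below computes Kendall's tau-b for one fine/coarse pair of measurements. It is a minimal pure-Python illustration, not the JMP IN procedure used in the study: the function name `kendall_tau_b` and the slope values are hypothetical, and the simple pairwise scan is only practical for small samples.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences (O(n^2) pair scan)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: not counted anywhere
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denominator = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denominator == 0:
        return float("nan")
    return (concordant - discordant) / denominator


# Hypothetical slope readings (degrees) for six plots.
field_slope = [5, 12, 20, 8, 30, 15]   # clinometer, fine resolution
dem_slope = [6, 10, 18, 11, 25, 17]    # 30-meter DEM, coarse resolution
print(round(kendall_tau_b(field_slope, dem_slope), 3))
```

Because tau-b works on the ordering of the plots rather than on the raw values, it does not require the normality that the Shapiro-Wilk test rejected.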


Relationship of Fine- and Coarse-resolution 
Environmental Variables to Vegetation Types 


One thousand sixty-four (1,064) plot samples were 
used in this analysis. Samples deleted included those 
for which complete field measurements were not ob- 
tained. The first three eigenvector axis scores were de- 
termined for each of the samples. Scores were ob- 
tained using detrended correspondence analysis 
(DCA) (Hill 1979) in PC-ORD version 3.14 (McCune 
and Mefford 1997). The scores were initially deter- 
mined for all samples pooled (n = 1,064) without any 
stratification for alliance type or downweighting of 
rare species. A second set of scores was determined 
for a dataset (n = 1,039) with "playa"-related plots re- 
moved (samples identified as Allenrolfea occidentalis 
Shrubland Alliance, Suaeda moquinii Intermittently 
Flooded Shrubland Alliance, Prosopis glandulosa 
Woodland Alliance, and those Sparsely Vegetated Al- 
liance plots dominated by Distichlis spicata or 
Pluchea sericea). Correlations between each pair of nu- 
merical environmental factors, the fine resolution and 
the coarse resolution, and each axis score were deter- 
mined using Spearman’s Rho and Kendall’s tau-b test 
for continuous variables (elevation, aspect, southwest- 
ness). One-way analysis of variance was used to calcu- 
late the significance of the relationship between the 
nominal variables (landform and rock/sediment com- 
position) and the ordination axis scores. 
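
The one-way analysis of variance step can be illustrated as follows. This is a sketch under stated assumptions: the function name `one_way_f` and the axis scores grouped by landform class are invented for illustration, and the code returns only the F statistic, not its significance level.

```python
def one_way_f(groups):
    """F statistic of a one-way ANOVA; `groups` is a list of lists of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more scores than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical DCA axis scores for plots in three landform classes.
scores_by_landform = [
    [1.0, 2.0, 3.0],   # e.g., rocky highland
    [2.0, 3.0, 4.0],   # e.g., upland alluvial deposits
    [5.0, 6.0, 7.0],   # e.g., wash
]
print(one_way_f(scores_by_landform))
```

A larger F indicates that the ordination scores differ more between the nominal classes than within them.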


Results 


Fine-resolution and coarse-resolution elevation meas- 
ures are highly correlated with each other (tau-b 0.97, 
P < .0000). Fine- and coarse-resolution slope, aspect, 
and southwestness measures are more moderately cor- 
related with each other (slope tau-b 0.64, aspect tau-b 
0.53, and southwestness tau-b 0.56, all P < .0000). 
Landform categories are likewise moderately corre- 
lated between the fine- and coarse-resolution observa- 
tions (Kappa = 0.52). A contingency table with cross 
tabulations (Table 10.1) shows the specific pairs of 
variables. If the coarse-resolution data are assumed to 
be the “true” assignment of landform types, the field 
crew's assessment for Playa is the most accurate 
(12/13, 92 percent), followed by Upland Alluvial De- 
posits (75 percent), Rocky Highland, and Dunes and 


TABLE 10.1. 


Comparison of field measurement of landform (fine resolution) to measurement from digital maps (coarse resolution). 


                          Fine-resolution landform
                                                Upland                                      Dunes
                          Rocky                 alluvial              Fluvial               and sand   Row
Coarse resolution         highland   Arroyo     deposits   Wash       floodplain Playa      sheets     totals     Row %
Rocky highland            465        36         97         36         9          3          0          646        72
Arroyo                    2          2          5          11         2          0          0          22         9
Upland alluvial deposits  20         6          216        38         2          5          1          288        75
Wash                      0          0          10         3          0          0          0          13         23
Fluvial floodplain        0          0          7          1          4          2          1          15         27
Playa                     0          0          1          0          0          12         0          13         92
Dunes and sand sheets     2          0          1          3          2          3          28         39         72
Column totals             489        44         337        92         19         25         30         1036
Column %                  95         5          64         3          21         48         93         100


Sand Sheets (72 percent each). Fluvial Floodplain (27 
percent), Wash (23 percent), and Arroyo (9 percent) 
have the lowest correspondence. 

Pictures taken at the plot locations by the field crew 
were examined to understand why seemingly obvious 
landform features (arroyos, washes, dunes, and sand 
sheets) were often mis-assigned. Mismatches in the- 
matic interpretation were noted. The field crew often 
called arroyos either washes or arroyos. The field crew 
sometimes labeled sand sheets on slopes as part of the 
larger landform (e.g., Upland Alluvial Deposits) even 
though the substrate was sandy. Mismatches in the 
resolution of interpretation also occurred. For in- 
stance, changes in slope direction (which may occa- 
sionally support drainage) were labeled as Rocky 
Highland in the coarse dataset and Arroyo in the fine 
dataset. Other resolution mismatches were noted; for 
example, where a plot appeared to include both Up- 
land Alluvial Deposit and Wash, the label applied by 
the field crew varied. Interpretation errors in the 
coarse dataset also caused mismatches. In several 
cases where the field crew called a plot Upland Allu- 
vial Deposit and the coarse label identified the feature 
as Wash, it was noted that features occurred in or near 
the plot that may have been misinterpreted as wash in 
the coarse data (e.g., a dirt road, a mined area, desert 
pavement). Variability in interpretation among field- 
crew members could not be determined because they 
worked in rotating pairs and each landform identifica- 
tion was a team report. 


TABLE 10.2. 


Comparison of field measurement (fine) of rock/substrate to measurement from digital maps (coarse). 


                      Fine-resolution rock/substrate composition
                      Igneous                   Calcareous
Coarse resolution     volcanic     Metamorphic  carbonate    Evaporite    Sedimentary  Row totals   Row %
Igneous               394          0            7            0            14           415          95
Metamorphic           74           0            4            0            2            80           0
Calcareous carbonate  37           2            38           0            15           92           41
Evaporite             29           0            0            7            15           51           14
Sedimentary           176          5            33           1            20           235          9
Column totals         710          7            82           8            66           873
Column %              55           0            46           88           30           100

Rock/sediment composition has poor correlation 
(Kappa = 0.18) between the GIS-derived variables and 
the field-observed variables. The contingency table 
(Table 10.2) suggests that the field crew could reason- 
ably recognize igneous substrate (394/415, 95 per- 
cent). However, the crew attempted classification in 
only 82 percent of the plots (873/1,064) and 81 per- 
cent of the time (n = 710) they determined the plot 
substrate was igneous. It seems that the crew only felt 
comfortable recognizing igneous substrate. 
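
As a worked check, Cohen's Kappa can be computed directly from the cross-tabulation in Table 10.2. The sketch below assumes the standard unweighted Kappa; the function name `cohens_kappa` is hypothetical, and the counts are the Table 10.2 cells as read from the source, with the few illegible cells inferred from the printed row and column totals. Under those assumptions the result is about 0.18, in line with the value reported above.

```python
def cohens_kappa(table):
    """Unweighted Cohen's Kappa for a square agreement table.

    Rows and columns must list the same classes in the same order.
    """
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Rows: coarse (map) classes; columns: fine (field) classes, in the order
# igneous, metamorphic, calcareous carbonate, evaporite, sedimentary.
TABLE_10_2 = [
    [394, 0, 7, 0, 14],
    [74, 0, 4, 0, 2],
    [37, 2, 38, 0, 15],
    [29, 0, 0, 7, 15],
    [176, 5, 33, 1, 20],
]
print(round(cohens_kappa(TABLE_10_2), 3))
```

The low value reflects how much of the raw agreement (459 of 873 plots on the diagonal) is already expected by chance when one class, igneous, dominates both sets of labels.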

T. Keeler-Wolf (unpublished data) described forty- 
two vegetation alliances, and each plot (observa- 
tion) in the modeling dataset was assigned to one of 
these types. The alliances were each represented by 
varying numbers of samples, ranging from 2 to 311 
(Table 10.3). Each plot also received scores on the 
DCA axes based on the results of the ordination 
analyses. Inspection of the individual plot scores de- 
termined that certain vegetation types, in particular 
playa types, were largely influencing the ordination 
scores for the 1,064 samples. Therefore, results are 
presented for the ordination based on 1,039 observa- 
tions (with playa types removed). With these plots re- 
moved, elevation was the variable most strongly cor- 
related with DCA axes 1 and 2 (Table 10.4). Slope 
and southwestness were significantly but weakly cor- 
related with DCA axis 1. In all cases the correlation of 
coarse variables with axis scores was equal to or slightly 
higher than that of the fine variables (Table 
10.4). 

Both the fine- and coarse-resolution-derived values 
for landform are most strongly related to the second 
axis (Table 10.5). The coarse-resolution-derived val- 
ues for rock/sediment composition also show a signifi- 
cant relationship to the second axis. The differences 
between the fine/coarse pairs for each axis score were 
greater than those for continuous variables. For the 


TABLE 10.3. 


Preliminary alliance types described by Keeler-Wolf 
(unpublished data). 


Preliminary alliances  No. samples 

Acacia greggii Shrubland Alliance  16 
Allenrolfea occidentalis Shrubland Alliance  8 
Ambrosia dumosa Dwarf-shrubland Alliance  34 
Artemisia nova/Mortonia utahensis Dwarf-shrubland Alliance  5 
Artemisia tridentata Shrubland Alliance  10 
Artemisia tridentata-Ephedra viridis Shrubland Alliance  4 
Atriplex canescens Shrubland Alliance  3 
Atriplex confertifolia Shrubland Alliance  44 
Atriplex hymenelytra Shrubland Alliance  29 
Atriplex polycarpa Shrubland Alliance  3 
Coleogyne ramosissima Shrubland Alliance  39 
Encelia farinosa Shrubland Alliance  18 
Ephedra nevadensis Shrubland Alliance  10 
Ephedra viridis Shrubland Alliance  3 
Ericameria nauseosa Shrubland Alliance  7 
Eriogonum fasciculatum Shrubland Alliance  10 
Grayia spinosa Shrubland Alliance  15 
Hymenoclea salsola Shrubland Alliance  20 
Juniperus spp. Wooded Alliance  14 
Larrea tridentata Shrubland Alliance  126 
Larrea tridentata/Ambrosia dumosa Shrubland Alliance  311 
Larrea tridentata/Encelia farinosa Shrubland Alliance  43 
Lycium andersonii Shrubland Alliance  7 
Menodora spinosa Shrubland Alliance  10 
Pinus monophylla Wooded Shrubland Alliance  9 
Pinus monophylla Woodland Alliance  5 
Pinus monophylla/Juniperus spp. Wooded Shrubland  28 
Pleuraphis rigida or P. jamesii Herbaceous Alliance  16 
Prosopis glandulosa Shrubland Alliance  10 
Prunus fasciculata Shrubland Alliance  13 
Psorothamnus spinosus Wooded Alliance  5 
Salazaria mexicana Shrubland Alliance  14 
Senna armata Shrubland Alliance  2 
Sparsely Vegetated Type  56 
Suaeda moquinii Intermittently Flooded Shrubland Alliance  5 
Viguiera parishii Shrubland Alliance  6 
Viguiera reticulata Shrubland Alliance  6 
Yucca brevifolia Wooded Shrubland Alliance  41 
Yucca brevifolia/Coleogyne ramosissima Wooded Shrubland Alliance  14 
Yucca brevifolia/Juniperus spp. Wooded Shrubland Alliance  10 
Yucca brevifolia/Pleuraphis spp. Wooded Herbaceous Alliance  4 
Yucca schidigera Shrubland Alliance  44 


TABLE 10.4. 


Correlation of environmental variables with ordination scores 
for alliances. 


                DCA Axis 1              DCA Axis 2              DCA Axis 3
                Fine        Coarse      Fine        Coarse      Fine        Coarse
Elevation       0.62        0.63        0.31        0.31        0.03        0.03
Slope           [illegible] [illegible] 0.02        0.04        -0.19       -0.2
Aspect          -0.02       -0.04       -0.02       -0.03       0           0.03
Southwestness   -0.07       -0.08       0           0.01        -0.03       -0.03


Note: Shaded cells indicate P < .001. 


TABLE 10.5. 


Landform and rock/sediment composition relationship with 
ordination axis scores for alliances. 


                            DCA Axis 1              DCA Axis 2              DCA Axis 3
                            Fine        Coarse      Fine        Coarse      Fine        Coarse
Landform                    8           14          [illegible] 21          [illegible] 16
Rock/sediment composition   13          8           [illegible] 10          27          1


Note: All cells P < .05 

Cell value = F statistic; n = 1,039 sites 

Critical Value (95% CI) for all fine landform = 3.7 

Critical Value (95% CI) for all coarse landform = 1.8 

Critical Value (95% CI) for all fine rock/sediment composition = 5.6 

Critical Value (95% CI) for all coarse rock/sediment composition = 1.7 


first and third axes, the fine-resolution variables have a 
stronger relationship to rock/sediment composition. 


Discussion 


This paper investigates and tests the assumption that 
the use of independently derived digital data (DEM, 
landform, and geology) is useful for vegetation pre- 
diction and mapping. If digitized or interpolated envi- 
ronmental data are reliable for the analysis of ecologi- 
cal determinants of vegetation patterns across 
gradients, some effort could be eliminated during 
large-resolution botanical surveys. Plant ecologists can 
then estimate the composition, structure, and cover of 
vegetation in samples and rely on independently de- 
rived environmental data for the investigation of vege- 
tation-environment patterns. 


The values for environmental variables that we 
compared are reasonably well correlated between field 
(fine) and GIS (coarse) data sources except for the 
rock/sediment composition. The highest correlation is 
for elevation. This is not surprising as both the fine- 
and coarse-resolution sources of data are derived 
from accurate methods, global positioning system (ac- 
curacy within meters) and 30-meter digital elevation 
models (DEMs). The digital elevation model is very 
acceptable as a substitute for field-derived elevation 
readings. 

Slope, aspect, and southwestness were only moder- 
ately correlated between the fine (field) and coarse 
(GIS) variables. Slope, aspect, and southwestness were 
integrated over a 90x90-meter area that may accentu- 
ate systematic and nonsystematic errors within the 
DEMs. In the field, one would expect that some sub- 
jectivity is involved in determining the aspect and 
slope, especially in uneven terrain. Other researchers 
have found correlations between field-measured and 
DEM-derived slope and aspect of 0.38 to 0.48, or root 
mean square errors ranging from 9 to 56 (for example, 
Davis and Goetz 1990; Franklin 1998; Wise 1998). 
Despite this, the correlations of fine- and coarse- 
resolution values with the ordination axes are similar. At 
the resolution of this project, 5 hectares, use of DEMs 
to calculate slope, aspect, and southwestness appears 
to be a reasonable substitute for field measurement. 

The limited agreement (Kappa = 0.52) between the fine and coarse 
descriptions of landform is noteworthy. Differences in 
definition of the landform categories as applied in the 
field versus the GIS mapping, scale of interpretation, 
and mistakes in interpretation can explain differences 
between the descriptions. The fine-resolution assign- 
ment of landform type by the field crew appears to be 
more correlated with vegetation composition than 
with the coarse-resolution map (Table 10.5). The ag- 
gregation of landforms may have masked important 
influence of some landform categories. 

The poor correlation between the fine- and coarse- 
resolution rock/sediment composition measures can be 
attributed to field-crew error. However, the F statistic 
is low but significant (Table 10.5) for both the fine- 
and coarse-resolution measures. It appears that in the 
Mojave Desert few geological composition classes are 
significant determinants of plant distributions. Alter- 
natively, it may be that the salient substrate features are 
not captured by the classification used regardless of 
the resolution. For example, it is known that some 
species respond to caliche layers (McAuliffe 1994) and 
others to salinity gradients (Hunt 1966; T. Keeler-Wolf 
unpublished data; Wallace et al. 1982; West 1983). 

The results reported can be used by land managers 
and researchers to set guidelines for collection of field 
data and use of independently derived digital data. 
Digital elevation models and derived variables are 
suitable surrogates for medium-resolution interpreta- 
tion (5 hectares). The usefulness of landform data may 
vary by landform category. It is evident that resolution 
can influence interpretation, particularly for drain- 
ages. Although it seems that a desert wash is an easy 
feature to define, the study demonstrates our ob- 
servers placed drainages in a variety of categories. Not 
only are standardized definitions of landform features 
needed but also calibration in interpretation is recom- 
mended. Although the field crew received at least two 
training sessions in substrate identification, more sub- 
stantial training in geology was needed to eliminate 
uncertainty. 


Summary 


In summary, field-collected environmental variables and 
site/sample-specific environmental variables derived 
from a digital elevation model are equivalent for the 
purpose of modeling medium-resolution Mojave vege- 
tation patterns. Substrate data derived from digital 
sources are not as readily interchangeable with field ob- 
servations of substrate. However, some of these differ- 
ences, at least for landforms, may be the result of differ- 
ent definitions for the same category. A different 
aggregation of the categories may have yielded a higher 
correlation between the variables and less difference in 
their correlation with ordination scores. 


This study supports the use of independently de- 
rived digital data but emphasizes the need to con- 
sider the resolution, underlying purpose, and classifi- 
cation of the physical data before it is used as a 
surrogate to field-collected environmental data. The 
accuracy of the predictions based upon digital surro- 
gates to field-collected data can be adversely affected 
if categorized data is casually used without consider- 
ation of the resolution of the field-collected training 
data. We recommend that elevation from DEMs and 
derived products such as slope, aspect, and south- 
westness be used in predicting meso-scale 
vegetation patterns. Interpreted data such as land- 
forms and rock/substrate can also be used; however, 
more caution is advised. 

It is recommended that users of a vegetation map 
developed by prediction with coarse-resolution 
model parameters be informed of the methodology 
of map development and the resulting limitations 
and/or cautions that may be resolution related (Mau- 
rer, Chapter 9). For example, land managers seek in- 
formation on vegetation distribution for manage- 
ment planning yet are often skeptical of the outcome 
(K. Thomas, U.S. Geological Survey, personal obser- 
vation). These users may wish to use vegetation 
maps at a scale parallel to that from which a field 
crew observes data. This chapter shows how differ- 
ences in the fine view and the coarse view become 
more apparent for variables that have been derived 
from interpretation in the field rather than measure- 
ment (e.g., landform, rock/substrate composition). 
Land managers must be cognizant of these limita- 
tions and be advised by the map developer of the in- 
fluence of resolution on map development and appli- 
cation. As pointed out elsewhere in this book 
(Henebry and Merchant, Chapter 23), “we are still 
learning to use the ‘macroscope’,” both as users and 
as developers of predictive mapping. 


The Influence of Spatial Scale on 
Landscape Pattern Description and 
Wildlife Habitat Assessment 


Margaret Katherine Trani (Griep) 


A fundamental theme of landscape ecology centers 
on the influence of spatial pattern upon the 
abundance and dynamics of species (Levin 1992). 
Landscape ecology includes the study of the patterns 
in communities and ecosystems, and the processes that 
affect those patterns. The structure and dynamics of 
communities are strongly influenced by the variation 
in patterns over large regions. 

Landscape pattern description is influenced by the 
scale of observation and can alter the description of 
species distributions (Meentemeyer and Box 1987; 
Holling 1992; Levin 1992). If spatial scale influences 
landscape analysis, then the habitat assessment result- 
ing from such analyses may be affected. The analysis of 
landscape pattern occurs at several scales (grain and 
extent); landscapes described at one scale are unlikely 
to be the same as those described at another. How to 
integrate landscape measurements made at disparate 
scales (Musick and Grover 1991) and how to extrapo- 
late information from one spatial scale to another re- 
mains a problem. There is a need to consider how the 
scale of examination may limit or add to observable re- 
lationships (Allen and Starr 1982). Quantitative analy- 
sis of these relationships may clarify these concerns. 

Landscape area (extent) and the resolution of ob- 
servation (grain) characterize spatial scale. Spatial res- 
olution, as used here, refers to the level of detail inher- 
ent in spatial data (i.e., the smallest discernible spatial 


unit). The objective of this chapter is to examine how 
changes in spatial scale influence landscape pattern 
analysis. First, changes in pattern metrics as a function 
of spatial scale are presented. Second, semivariogram 
analysis is used to assess the variability of metric be- 
havior. Finally, the implications for wildlife habitat 
evaluation are discussed. 


Methods 


Fourteen pattern metrics that express aspects of spa- 
tial heterogeneity, fragmentation, and edge character- 
istics were selected for analysis (Table 11.1). Selection 
was based on their potential relevance to wildlife re- 
sources (Trani 1996; Trani and Giles 1999). These 
metrics are described briefly below. 

The spatial heterogeneity metrics express the com- 
plexity and variability among the land classes occur- 
ring on a landscape. The Simpson and binary compar- 
ison matrix indices are based upon the number and 
proportions of land classes. Evenness refers to how 
abundance is distributed among the classes present on 
the landscape, while interspersion reflects the arrange- 
ment of those classes. 
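
As a concrete illustration of a heterogeneity metric built from the number and proportions of land classes, the sketch below computes one common form of the Simpson index, 1 minus the sum of squared class proportions. Whether this exact form matches the one used in the chapter is an assumption; the function name `simpson_index` and the class areas are hypothetical.

```python
def simpson_index(class_areas):
    """Simpson diversity, 1 - sum(p_i ** 2), from per-class areas or pixel counts."""
    total = sum(class_areas)
    if total <= 0:
        raise ValueError("class_areas must sum to a positive value")
    return 1.0 - sum((area / total) ** 2 for area in class_areas)


# Hypothetical landscape: 700 ha forest, 200 ha pasture, 100 ha water.
print(round(simpson_index([700, 200, 100]), 2))
```

The index is zero for a landscape with a single class and rises as area is spread more evenly over more classes.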

Fragmentation metrics describe the amount of for- 
est cover on a landscape. This includes patch metrics 
that reflect the count (number of forest patches), size 
(mean patch size), or degree of isolation (interpatch 
distance) on a landscape. Fragmentation Index I 




TABLE 11.1. 


Landscape pattern metrics selected for spatial scale modeling. 


Landscape metric            Selected reference

Spatial Heterogeneity
Simpson Index               Pielou 1977
Landscape Evenness          Romme 1982
Interspersion               Eastman 1997
Binary Comparison Matrix    Murphy 1985

Fragmentation
Fragmentation Index I       Monmonier 1982
Fragmentation Index II      Ripple et al. 1991
Percent Interior Forest     Dunn et al. 1991
Number of Forest Patches    Trani 1996
Mean Patch Size             Dunn et al. 1991
Interpatch Distance         Urban and Shugart 1986
Percent Forest Cover        Lauga and Joachim 1992
Contiguity Index            LaGro 1991

Edge Characteristics
Total Forest Edge           Ranney et al. 1981
Convexity Index             Berry 1991


reflects the number of distinct landscape regions on a 
map relative to the total number of map pixels. Frag- 
mentation Index II (the average distance to non- 
forested areas) and percent interior forest (the amount 
of forest area remaining after buffer removal from the 
edge of each forested tract) describe the distribution 
and amount of forest cover. Forest contiguity ex- 
presses the spatial connectedness or the unbroken ad- 
jacency of a landscape. 

The edge metrics characterize areas where two dif- 
ferent land classes come together. Total forest edge 
refers to the length of edge that exists at the interface 
between forest and other land classes, while the con- 
vexity index is a perimeter-to-area ratio that describes 
the amount of edge per unit area of forest. 
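
The two edge measures can be sketched on a small raster. This is an illustration under stated assumptions, not the formulation of the cited references: forest is coded 1, edge is counted wherever a forest pixel shares a side with a non-forest pixel, the outer map border is not counted as edge, and the function name `forest_edge_and_area` is hypothetical.

```python
def forest_edge_and_area(grid, cell_size_m=30.0, forest=1):
    """Return (edge length in meters, forest area in square meters)."""
    rows, cols = len(grid), len(grid[0])
    edge_segments = 0
    forest_cells = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != forest:
                continue
            forest_cells += 1
            # Check the four side neighbors; each forest/non-forest boundary
            # is counted once, from the forest side.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc] != forest:
                    edge_segments += 1
    return edge_segments * cell_size_m, forest_cells * cell_size_m ** 2


# Hypothetical 3 x 3 landscape: 1 = forest, 0 = other land classes.
landscape = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
edge_m, area_m2 = forest_edge_and_area(landscape)
print(edge_m, area_m2, edge_m / area_m2)  # total edge, area, edge per unit area
```

Because edge length depends on pixel size, both values change when the same landscape is resampled to a coarser grain.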

Twenty-two forested landscapes (scale 1:24,000) 
that represent a broad array of landscape conditions 
were selected from the George Washington and Jeffer- 
son National Forests, Virginia. USDA Forest Service 
resource specialists visually grouped the landscapes; 
the criteria used to assign membership centered on the 
spatial arrangement and the number of vegetation 
communities and land classes. The maps were placed 


into one of two categories: simple or complex land- 
scape characteristics. The simple landscape group 
comprised maps with relatively few polygons per 
landscape area, such as those with large, continuous 
forest blocks with few land classes. The complex land- 
scape group contained numerous polygons per land- 
scape area, characterized by numerous forest patches, 
irregular forest boundaries, and diverse arrangements 
of land-use classes. 

These maps were used as initial conditions for the 
cartographic modeling process. Digitized using a 30- 
meter pixel (0.09 hectare) resolution, the maps served 
as a baseline reference for making comparisons with 
alternative maps of coarse resolution (pixel size 
greater than 30 meters). Producing the base map at 
the finest resolution available made it possible to take 
measurements over an increasing range of spatial 
scales. Changes in spatial scale were modeled using 
Idrisi (Eastman 1997). The baseline map was repeat- 
edly generalized at a sequence of pixel sizes (30, 60, 
. . . 420 meters) using 30-meter increments; the number 
of rows and columns in each map was reduced 
while pixel size was enlarged correspondingly. Each 
succeeding image was derived from the 30-meter base- 
line reference map and was independent of each pre- 
ceding image. This procedure provided a consistent, 
repeatable means of map generalization. The suite of 
landscape metrics was tabulated at each level of spa- 
tial resolution. 
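
The generalization sequence can be sketched as block aggregation of the 30-meter baseline grid. The text states only that Idrisi was used, so the aggregation rule below (most common class in each block) is an assumption, as are the function name `generalize` and the toy map. As in the procedure described above, every coarser map is derived from the baseline and not from the preceding map.

```python
from collections import Counter


def generalize(grid, factor):
    """Aggregate a categorical grid by `factor`, keeping the most common class.

    Ties go to the class encountered first in the block (row by row).
    Rows or columns that do not fill a complete block are dropped.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows - rows % factor, factor):
        out_row = []
        for c0 in range(0, cols - cols % factor, factor):
            counts = Counter(
                grid[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
            )
            out_row.append(max(counts, key=counts.get))
        out.append(out_row)
    return out


# Hypothetical 4 x 4 baseline map at 30 meters (class codes are arbitrary).
baseline = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [2, 2, 1, 1],
    [2, 2, 1, 0],
]
# Pixel sizes 30, 60, ..., 420 meters, each derived from the baseline.
maps_by_pixel_size = {
    size: generalize(baseline, size // 30) for size in range(30, 421, 30)
}
print(maps_by_pixel_size[60])
```

On this toy map only the 30- and 60-meter versions retain more than one pixel; the maps used in the study were large enough to support the full sequence.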

Regression analysis based on least 
squares was used to generate trend lines illustrating 
the relationship between each metric and spatial reso- 
lution. Trend lines were used solely to provide a useful 
summary of the data and have no statistical signifi- 
cance. The plotted lines having the best fit were se- 
lected on a visual basis. Semivariogram analysis was 
used to examine metric variability as a function of 
spatial scale. The semivariogram summarizes spatial 
variation in magnitude and general form (Oliver and 
Webster 1986), relating semivariance to the spatial 
distance between measurements (Curran 1988). The 
semivariance, $\gamma_h$, is defined as

$$\gamma_h = \frac{1}{2n} \sum_{i=1}^{n-h} \left( Z_i - Z_{i+h} \right)^2$$


Figure 11.1. General form of the semivariogram, with semivariance
on the y-axis and pixel size on the x-axis. The sill represents
the maximum level of semivariance observed. The range indicates
the place on the x-axis where semivariance reaches 95 percent of
the sill.


where h is the distance over which $\gamma_h$ is measured, $Z_i$
is the value of a metric taken at resolution level i
(e.g., 30 meters), $Z_{i+h}$ is another measurement taken
h levels away (e.g., 60 meters), and n is the number of
observations used in the estimate of $\gamma_h$. Semivariance
is one-half of the mean squared difference between
metric values separated by h levels. The semivariogram
is a plot of $\gamma_h$ as a function of h.
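
The definition above translates directly into code. The sketch below is one reading of it, not the authors' implementation: it takes n to be the number of pairs actually compared at lag h, which matches the description of semivariance as one-half of the mean squared difference. The function names are hypothetical.

```python
def semivariance(values, h):
    """Semivariance at lag h for a sequence of metric values.

    values[i] is the metric measured at resolution level i
    (30 m, 60 m, ...). Assumption: n is the number of pairs
    compared at lag h.
    """
    if h < 0 or h >= len(values):
        raise ValueError("lag must satisfy 0 <= h < len(values)")
    pairs = [(values[i], values[i + h]) for i in range(len(values) - h)]
    n = len(pairs)
    return sum((a - b) ** 2 for a, b in pairs) / (2 * n)


def semivariogram(values, max_lag):
    """Return [(h, semivariance)] for h = 0 .. max_lag."""
    return [(h, semivariance(values, h)) for h in range(max_lag + 1)]
```

At lag 0 every value is compared with itself, so the function returns zero, as the text notes for the 30-meter baseline.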

At 30-meter resolution, the value of a metric is 
compared to itself and the semivariance is zero. At 60 
meters and beyond, semivariance rises when the com- 
parisons are increasingly different from those obtained 
at 30 meters. This increase continues until the values 
are no longer related to each other and their squared 
difference becomes equal to the average variance of all 
samples. The plotted line of $\gamma_h$ levels off and becomes
flat (the sill), representing the maximum level of
semivariance observed (Fig. 11.1). The point on the 
x-axis where semivariance reaches approximately 95 
percent of the sill is the range and serves as an esti- 
mate of area similarity (Yost et al. 1982). At distances 
closer than the range, the values are considered scale- 
dependent (i.e., they become more alike with decreas- 
ing distance between them). Scale-dependence de- 
scribes the relationship between the magnitude and 
variability of a process and the scale of measurement. 
Beyond the range, metric values are considered scale-
independent (i.e., the values do not reflect the scale of
measurement). The general form of the semivariogram
relies on pixel area, the spacing of those pixels, 
and the pattern metric computed based on landscape 
characteristics. 
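
Reading the sill and range off a computed semivariogram can be sketched as follows. This assumes the working definitions given above (sill as the maximum semivariance observed, range as the first lag reaching 95 percent of the sill); the function name and the sample values are illustrative only.

```python
def sill_and_range(semivariogram, fraction=0.95):
    """Estimate the sill and range from [(lag, semivariance)] pairs.

    The sill is taken as the maximum semivariance observed; the range
    is the first lag at which semivariance reaches `fraction` of it.
    """
    if not semivariogram:
        raise ValueError("semivariogram is empty")
    sill = max(gamma for _, gamma in semivariogram)
    for lag, gamma in sorted(semivariogram):
        if gamma >= fraction * sill:
            return sill, lag
    raise RuntimeError("unreachable: the maximum always meets the threshold")


if __name__ == "__main__":
    # Made-up semivariances at pixel sizes of 30 .. 180 meters.
    example = [(30, 0.0), (60, 0.2), (90, 0.6),
               (120, 0.97), (150, 1.0), (180, 0.99)]
    print(sill_and_range(example))
```

Metric values at lags shorter than the returned range would be treated as scale-dependent, and those beyond it as scale-independent.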


Results 


Spatial scale (grain) influenced pattern metrics in a
variety of ways, and two groups of factors repeatedly
changed metric values. The first group, related to the
underlying pattern of the landscape, included the number,
size, shape, and distribution of land classes. These are
described with the specific pattern metrics. The second
group reflected the sampling procedure used in the
spatial scaling process; these factors included
sampling intensity and sample unit size.

Sampling intensity. During the modeling process, 
each succeeding map had a reduction in the number of 
pixels representing the pattern of the original land- 
scape. Each landscape image contained 25 percent of 
the pixels contained in the previous map. The reduc- 
tion caused problems inherent to small sample sizes 
(e.g., the smaller the sample, the less representative it 
becomes of the sampled landscape). Accuracy of sev- 
eral metrics was closely linked to sample size. At each 
new resolution, a subset of pixels was selected to rep- 
resent the landscape; as each subset became smaller, it 
no longer accurately reflected the original landscape. 
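
The drop in sampling intensity is easy to quantify. The sketch below assumes a landscape of fixed extent, so the pixel count falls with the inverse square of pixel size relative to the 30-meter baseline; the 12,600-meter extent is a made-up figure and the function names are hypothetical.

```python
def pixel_count(extent_m, pixel_size_m):
    """Whole pixels needed to cover a square landscape of side extent_m."""
    per_side = extent_m // pixel_size_m
    return per_side * per_side


def sampling_fraction(pixel_size_m, base_pixel_m=30):
    """Fraction of baseline pixels retained at a coarser pixel size,
    assuming the landscape extent stays fixed."""
    return (base_pixel_m / pixel_size_m) ** 2


if __name__ == "__main__":
    for size in (30, 60, 210, 420):
        print(size, "m:", pixel_count(12600, size), "pixels,",
              round(100 * sampling_fraction(size), 2), "% of baseline")
```

Doubling the pixel size from 30 to 60 meters leaves one quarter of the baseline pixels; by 420 meters fewer than 1 percent remain.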

The variance associated with the estimates of met- 
ric means appeared inversely proportional to sample 
size (O'Neill et al. 1991). Variance increased as sam- 
pling intensity decreased with the increase of pixel 
size. The degrees of freedom for each variance esti- 
mate were also reduced at each level; the reliability of 
semivariance estimates decreased as the resolution was 
reduced. This occurred because the number of obser- 
vations decreased with increasing distance. Fluctua- 
tions in pattern descriptions for the new maps were 
dependent upon those pixels retained and upon those 
pixels selected for removal. In a very small area, the 
selection of the initial pixel had a profound effect on 
the semivariogram. 

Sample unit size. Sampling intensity was reduced as 
spatial scale and pixel size increased. The interplay be- 
tween pixel size and landscape area influenced several 
aspects of landscape description. When the pixels
were small in relation to patch size, the landscape was
sampled with pixels small enough to lie wholly within 
those patches. This resulted in little change in the dis- 
tribution of patches; landscape pattern description re- 
mained similar to the preceding level. Small pixels in- 
creased the likelihood that a patch would be retained 
into the next resolution level. As pixel size increased 
but remained small enough that there was little chance 
that the pixel would cross patch or land class bound- 
aries, a shift in patch distribution occurred with minor 
changes in land class proportions. 

As pixel size reached and exceeded patch size, re- 
tention of patches and land classes became unpre- 
dictable; there was either a loss of patches or a coales- 
cence of like-patches. This loss (or gain) contributed 
to the variance observed at the coarse-resolution lev- 
els. Spatial relationships among the land classes were 
also altered with large pixels. A marked loss in land- 
scape detail occurred and continued as pixel size ex- 
ceeded the known median patch and land class areas. 

Pixel size set a minimum threshold for land class fea- 
tures. An increase in this threshold at each resolution 
level influenced the heterogeneity and the patch metrics 
through the loss of landscape features. Pixel size set the 
minimum edge-to-edge distance between patches; the 
interpatch distance could not be less than the current 
pixel size. Figure 11.2 presents the relationship between 
spatial resolution and sampling intensity. 

The progression of the evenness index across the 
range of resolution levels is depicted in Figure 11.3. 
(The Simpson index, not shown, exhibited a similar 
trend.) The continuity of both metrics during changes 
in scale was unique among the other pattern metrics. 
Values remained fairly stable until a 270-meter pixel 
size was reached. At that point, both indices became 
quite variable with each resolution reduction, depend- 
ent upon the conditions encountered. Large pixels ei- 
ther omitted some land classes or resulted in dramatic 
proportional changes within the remaining classes. 
The vertical axes of the trend graphs (and subsequent
figures where noted) were scaled by dividing metric
values at each resolution level by the values obtained
at the 30-meter resolution level (EXP_x m / EXP_30 m).
Scaling the axes to 1.0 made the values unitless and
allowed direct comparison across several different metrics.
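
The axis scaling amounts to dividing each metric value by its 30-meter value. A minimal sketch, with made-up metric values and a hypothetical function name:

```python
def scale_to_baseline(values_by_pixel_size, base_pixel_m=30):
    """Divide each metric value by the value at the baseline pixel size,
    so the baseline becomes 1.0 and the series becomes unitless."""
    base = values_by_pixel_size[base_pixel_m]
    if base == 0:
        raise ValueError("baseline value is zero; cannot scale")
    return {size: value / base
            for size, value in values_by_pixel_size.items()}


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(scale_to_baseline({30: 0.8, 60: 0.6, 90: 0.4}))
```

A scaled value of 0.5 then reads directly as "half the baseline value" regardless of the metric's original units.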

Figure 11.2. The relationship between spatial resolution (pixel
size) and sampling intensity (number of sampled pixels). Axes:
pixel size in meters (30 to 390) on the x-axis; number of sampled
pixels (thousands) on the y-axis.

The effect of spatial resolution on metric behavior
depended on landscape complexity. For simple landscapes,
the evenness metric became unpredictable as pixel size
increased: one large pixel might contribute to
proportional evenness, while the next might reduce the
index value. In contrast, there was a consistent trend
for those values computed from the complex landscapes.
The loss of one land class from a complex landscape
having forty-five classes represents a 2 percent loss,
while the loss of one land class from a simple landscape
with nine land classes represents an 11 percent loss.
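
The arithmetic behind these percentages, and its effect on a diversity index, can be sketched as follows. The Simpson form used here, 1 minus the sum of squared proportions, is a common one and is assumed rather than taken from the chapter; the equal class areas are also an assumption made for illustration.

```python
def richness_loss(n_classes, n_lost=1):
    """Fraction of land classes lost when n_lost classes drop out."""
    return n_lost / n_classes


def simpson_diversity(areas):
    """Simpson diversity, 1 - sum(p_i^2), from class areas or counts.

    Assumption: this common form stands in for the chapter's index.
    """
    total = sum(areas)
    return 1.0 - sum((a / total) ** 2 for a in areas)


if __name__ == "__main__":
    for n in (45, 9):
        before = simpson_diversity([1] * n)
        after = simpson_diversity([1] * (n - 1))
        print(n, "classes: richness loss",
              round(100 * richness_loss(n), 1), "%;",
              "Simpson change", round(before - after, 4))
```

With equal class areas, dropping one class changes the index far more on the nine-class landscape than on the forty-five-class one.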

The horizontal line between 30 and 210 meters on 
the semivariogram indicates intervals where the spa- 
tial pattern is stable and not scale dependent. The 
abrupt semivariance rise following 270 meters indi- 
cates that values measured beyond this level are be- 
coming different. The marked reversal of slope after 
reaching a maximum is uncommon (Oliver and Web- 
ster 1986), suggesting that further fluctuation in even- 
ness values may occur beyond 420 meters. 

Figure 11.3. The influence of spatial scale on the heterogeneity indices with their corresponding
semivariograms. Plot markers identify simple (square) and complex (diamond) landscapes. Vertical
axes are scaled to 1.0 for direct comparison. Trend lines are derived from regression analysis.

Reducing spatial resolution resulted in striking
increases for both the interspersion and binary
comparison matrix metrics (Fig. 11.3). Logarithmic
regression curves were computed for the relationship
between metric behavior and pixel resolution.
Interspersion increased at a constant rate
(Interspersion = -8.32 + 2.54 LN Resolution; r2 = 0.95)
and binary comparison matrix behaved similarly (binary
comparison matrix = -6.79 + 2.14 LN Resolution;
r2 = 0.95). Values obtained at 420-meter levels were six
to eight times greater than baseline values, a magnitude
far surpassing that observed for the Simpson and
evenness indices. Interspersion and binary comparison
matrix are based on a three-by-three roving window that
covers proportionally more area with each resolution
level. Landscape complexity also influenced metric
behavior; values routinely overestimated the spatial
heterogeneity of simple landscapes.
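
A logarithmic trend of the form metric = a + b LN(resolution) can be fitted by ordinary least squares on the log-transformed pixel size. The sketch below is a generic fit, not the software used in the study; the function names are hypothetical, and the coefficients in the usage example are the ones reported above for interspersion.

```python
import math


def fit_log_trend(resolutions, values):
    """Least-squares fit of values = a + b * ln(resolution)."""
    xs = [math.log(r) for r in resolutions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def log_trend(a, b, resolution):
    """Evaluate the fitted trend at a given pixel size (meters)."""
    return a + b * math.log(resolution)


if __name__ == "__main__":
    # Evaluate the reported interspersion trend at the coarsest level.
    print(round(log_trend(-8.32, 2.54, 420), 2))
```

Evaluated at 420 meters, the reported interspersion trend gives a scaled value of roughly 7, in line with the six- to eightfold increase described in the text.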

The interspersion semivariogram depicts the classic
semivariance curve. (Binary comparison matrix, not
shown, emulated interspersion.) Spatial dependency
exists at each resolution until a sill is reached at
390 meters. Each successive landscape map has a pattern
directly dependent on the resolution at which it is
measured. The change in form beyond 390 meters
approximates the range.

Figure 11.4. The influence of spatial scale on the edge metrics with their corresponding
semivariograms. Plot markers identify simple (square) and complex (diamond) landscapes. Vertical
axes are scaled to 1.0 for direct comparison. Trend lines are derived from regression analysis.

The reduction in spatial resolution resulted in a
marked decline in convexity and total forest edge
length (Fig. 11.4). Edge detail was suppressed at each
successive resolution, reflected by the exponential
regression trend line for forest edge (LN Forest Edge =
0.039 - 0.001 Resolution; r2 = 0.973) and the
logarithmic trend for convexity (Convexity = 1.60 - 0.156
LN Resolution; r2 = 0.971). The edge length and
convexity measured at the coarsest resolution were 50
percent less than baseline. Each change reflected the
loss of edge detail as patch shapes were repeatedly
simplified. The more complex the landscape pattern,
the greater the potential for the loss of edge detail.
The semivariograms for edge and convexity show a 
tendency toward a linear trend in semivariance with 
small-scale noise, suggesting weak spatial dependence. 
Semivariance change is almost imperceptible between 
30 and 90 meters. The semivariogram for forest edge 
length exhibits unbounded variance (Curran 1988). 
Convexity semivariance beyond 360 meters fluctuates 
near 0.01 (a possible sill). These values are considered 
spatially independent if this fluctuation continues. 
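
The loss of edge detail under coarsening can be reproduced on a toy grid. This sketch assumes edge length is the number of pixel sides shared by a forest and a nonforest pixel, multiplied by pixel size; the chapter's exact edge definition may differ, and the grid and function name are illustrative.

```python
def forest_edge_length(grid, pixel_size_m, forest=1):
    """Total length (m) of boundaries between forest and nonforest pixels.

    Each side shared by a forest pixel and a nonforest neighbor is
    counted once; the outer map border is not counted as edge.
    """
    rows, cols = len(grid), len(grid[0])
    shared_sides = 0
    for r in range(rows):
        for c in range(cols):
            here = grid[r][c] == forest
            if c + 1 < cols and here != (grid[r][c + 1] == forest):
                shared_sides += 1
            if r + 1 < rows and here != (grid[r + 1][c] == forest):
                shared_sides += 1
    return shared_sides * pixel_size_m


if __name__ == "__main__":
    # A jagged forest boundary at 30 m ...
    fine = [[1, 1, 1, 0],
            [1, 1, 0, 0],
            [1, 1, 1, 0],
            [1, 1, 0, 0]]
    # ... and the same map keeping every second row and column (60 m).
    coarse = [row[::2] for row in fine[::2]]
    print(forest_edge_length(fine, 30), forest_edge_length(coarse, 60))
```

In this deliberately extreme example the jagged boundary disappears entirely at the coarser pixel size, because none of the retained pixels falls in nonforest.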
The influence of changing spatial scale on forest
contiguity and the fragmentation indices is depicted
in Figure 11.5. Fragmentation Index I values at the
coarsest resolution were 170 times greater than baseline
values, a magnitude unsurpassed by any other landscape
metric (LN Fragmentation Index I = -6.31 + 1.89 LN
Resolution; r2 = 0.998). This index expresses the ratio
between polygon number and total map pixels. With each
resolution reduction, the denominator decreases
fourfold. In contrast, changes in polygon number (the
numerator) are gradual. This resulted in an exponential
increase for the index over all landscapes. The
semivariogram illustrates nonstationary processes
(Oliver and Webster 1986); the range for finite variance
is not reached.
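
The reported log-log trend for Fragmentation Index I can be back-transformed to see the size of the effect, and the ratio itself shows why the index climbs. This is a sketch under stated assumptions: the coefficients are those reported above, while the polygon and pixel counts in the usage example are made up and the function names are hypothetical.

```python
import math


def loglog_trend(a, b, resolution):
    """Back-transform LN(metric) = a + b * LN(resolution)."""
    return math.exp(a + b * math.log(resolution))


def fragmentation_index_1(n_polygons, n_pixels):
    """Ratio of polygon number to total map pixels."""
    return n_polygons / n_pixels


if __name__ == "__main__":
    print(round(loglog_trend(-6.31, 1.89, 420)))
    # Made-up counts: pixels fall fourfold while polygons fall slightly.
    print(fragmentation_index_1(50, 10000),
          fragmentation_index_1(40, 2500))
```

The trend evaluates to roughly 165 at 420 meters, close to the 170-fold increase reported; the second line shows the index more than tripling when the pixel count falls fourfold but the polygon count falls only from 50 to 40.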

Dramatic changes were also observed for
Fragmentation Index II (the mean distance to nonforest
pixels). The repetitive loss of forest boundary detail
and small forest clearings resulted in an increase in
metric values with each resolution change (LN
Fragmentation Index II = -0.116 + 0.004 Resolution;
r2 = 0.884). Application to complex landscapes resulted
in a fivefold overestimation, while values computed from
simple landscapes were three times baseline values.
Values rose slightly approaching 180 meters and
increased quickly with succeeding changes in resolution.
Semivariance does not reflect the scale of measurement
within this interval. Peak height and spacing mirror
the different patterns emerging during scale changes;
distance to nonforest was quite variable within these
periods.
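
A mean distance-to-nonforest measure can be sketched by brute force on a small grid. The definition used here, Euclidean distance between pixel centers from each forest pixel to its nearest nonforest pixel, is an assumption; the chapter does not spell out the computation, and the function name is hypothetical.

```python
import math


def mean_distance_to_nonforest(grid, pixel_size_m, forest=1):
    """Mean distance (m) from each forest pixel to the nearest
    nonforest pixel, measured between pixel centers."""
    forest_cells, other_cells = [], []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            target = forest_cells if value == forest else other_cells
            target.append((r, c))
    if not forest_cells or not other_cells:
        raise ValueError("grid needs both forest and nonforest pixels")
    total = 0.0
    for fr, fc in forest_cells:
        total += min(math.hypot(fr - r, fc - c) for r, c in other_cells)
    return total * pixel_size_m / len(forest_cells)


if __name__ == "__main__":
    # A forest block with one small clearing in the middle.
    grid = [[1, 1, 1],
            [1, 0, 1],
            [1, 1, 1]]
    print(round(mean_distance_to_nonforest(grid, 30), 1))
```

If coarsening removes a small clearing like this one, every forest pixel that was close to it is suddenly measured against a more distant nonforest pixel, which is how the index inflates.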

Figure 11.5. The influence of spatial scale on forest contiguity and the fragmentation indices with
their corresponding semivariograms. Plot markers identify simple (square) and complex (diamond)
landscapes. Vertical axes are scaled to 1.0 for direct comparison. Trend lines are derived from
regression analysis.

Changing spatial scale resulted in a steady decline
in contiguity values for all landscapes (LN Contiguity
= 0.206 - 0.057 LN Resolution; r2 = 0.914). The
mean rate of information loss for simple landscapes
was 9 percent, while that for complex landscapes was
17 percent. Contiguity values were increasingly
variable with each succeeding resolution, dependent upon
the distribution of remaining forest pixels. Spatial
dependency existed until 270 meters; semivariance then
changed direction until a range was reached at 390
meters.

Figure 11.6 illustrates the relationship between
spatial resolution and percent forest cover (Forest Cover
= 0.991 + 0.0003 Resolution; r2 = 0.827). Metric values
remained relatively constant through 90 meters, then
increased until the coarsest resolution was reached.
The distribution of forest and nonforest within a land- 
scape influenced the variability observed for forest 
cover estimation. Forest cover was overestimated on 
continuous forest landscapes, particularly forest areas 
compact in shape. Elongated or irregularly shaped 
nonforest areas were lost quickly during the scaling 
process, resulting in the overestimation of forest 
cover. In contrast, landscapes with compact areas of 
nonforest (e.g., agricultural fields) often retained these 
areas during the resolution changes. 

Figure 11.6. The influence of spatial scale on percent forest cover and forest interior with their
corresponding semivariograms. Plot markers identify simple (square) and complex (diamond)
landscapes. Vertical axes are scaled to 1.0 for direct comparison. Trend lines are derived from
regression analysis; the line for forest cover represents both simple and complex landscapes. The
dashed line on the forest interior semivariogram indicates a possible sill.

The variability of forest cover estimation is evident
by the slope reversals in the semivariogram. The
variable period height reflects the fluctuating
distribution of forest pixels emerging from each change
in resolution. At fine resolution levels, semivariance
does not reflect the scale of measurement; as resolution
becomes coarser, semivariance rises.

Figure 11.6 also depicts the relationship between
spatial scale and percent forest interior. Forest
interior was quite sensitive to changes in resolution; the
net change in values was negative (LN Forest Interior
= 0.588 - 0.170 LN Resolution; r2 = 0.744). Metric
values fell at 60 meters and continued to drop
throughout each level. This steady reduction was a
function of the loss of edge detail as forested areas
were repeatedly simplified, the shrinkage of forested
areas, and the relationship between forest buffer and
pixel size. There was a greater likelihood for the loss
of forest interior on complex landscapes. At the
coarsest resolution, values on complex landscapes were
reduced by 50 percent compared to a 25 percent reduction
measured on simple landscapes. The loss of interior
forest accelerated where several nonforested areas were
dispersed throughout a landscape.

The forest interior semivariogram depicts scale de- 
pendence between 30 and 230 meters. Semivariance 
values beyond 230 meters fluctuate slightly above and 
below 0.0075 (the sill); values beyond this distance are 
considered spatially independent (Palmer 1988). 

Figure 11.7 illustrates the association between spa- 
tial resolution and the number of forest patches. 
(Mean patch size and interpatch distance, not shown, 
depict a similar trend.) The variability observed for 
the patch metrics was higher than that for any other 
metric group. Since each stage of the scaling process 
was independent of the last, the opportunity for disap- 
pearance or coalescence of forest patches varied with 
each resolution. 

Figure 11.7. The influence of spatial scale on forest patch metrics. Plot markers identify simple
(square) and complex (diamond) landscapes. Vertical axes are scaled to 1.0 for direct comparison.
Trend lines are derived from regression analysis; the line for number of patches represents both
simple and complex landscapes.

The influence on forest patch number was negative;
45 percent of the patches were lost by the coarsest
resolution level (LN Forest Patches = 0.088 - 0.002
Resolution; r2 = 0.858). Patch detection was suppressed
as pixel size exceeded patch size; the rate of patch loss
was greater on landscapes comprising several small
patches than on landscapes composed of a few large
patches. Landscapes with clumped patch arrangements
lost those patches at a slower rate than those
landscapes with uniform patch distributions. The
associated semivariogram highlights the reversal of
variability that occurs at several levels; a finite
semivariance level was not reached.
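
Patch loss and coalescence can be demonstrated by counting connected forest patches before and after coarsening. The sketch below assumes four-neighbor connectivity and a keep-every-second-pixel subsample; both are illustrative choices rather than the study's documented settings, and the function name is hypothetical.

```python
def count_patches(grid, forest=1):
    """Count forest patches using four-neighbor connectivity."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != forest or seen[r][c]:
                continue
            patches += 1
            seen[r][c] = True
            stack = [(r, c)]
            while stack:  # flood-fill the whole patch
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == forest and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return patches


if __name__ == "__main__":
    fine = [[1, 0, 0, 1],
            [0, 0, 0, 0],
            [1, 1, 0, 0],
            [1, 1, 0, 1]]
    coarse = [row[::2] for row in fine[::2]]
    print(count_patches(fine), count_patches(coarse))
```

In this toy map two single-pixel patches are dropped and the remaining two coalesce, so four patches at the fine resolution become one at the coarse resolution.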

The disappearance of forest patches during changes
in spatial resolution also influenced mean patch size
and interpatch distance. When pixel size was less than
patch size, the likelihood was high that the patch
would remain during the next resolution. As pixel size
reached and exceeded patch size, retention of those
patches became unpredictable, resulting in an r2 of
0.441 for mean patch size and an r2 of 0.184 for
interpatch distance. As the resolution decreased, mean
patch size could increase (e.g., the selected patch area
is magnified by the new pixel size) or decrease (e.g.,
the nonselected patch area is replaced by nonforest
pixels). Patch distribution and size influenced
interpatch distance. If a patch was removed during the
scaling process, mean interpatch distance declined.
However, if a patch proximal to another patch was
removed, mean distance rose. Each patch metric poses
predictability problems at different spatial scales.

The semivariograms for mean patch size and inter- 
patch distance show prominent semivariance fluctua- 
tion, suggesting that the pattern described varies con- 
tinuously with scale. The stability of the semivariance 
observed between 30 and 90 meters suggests that 
patch size exceeded pixel size. Semivariance for inter- 
patch distance increased immediately following the 
30-meter baseline, reflecting the influence of patch 
number and distribution. 


Discussion 


Landscape pattern reflects the number of distinctive
classes, their sizes, shapes, and the distances between
them. When pattern analysis is examined as a function
of spatial resolution, particular features appear to
repeatedly influence patterns observed at coarse
resolution levels. A consideration of these factors may
aid in determining the adequacy of resolution levels for
describing landscapes (Wehde 1982).
The size and shape of land classes influenced their
persistence during changes in spatial scale. Compact-
shaped classes had a higher probability of being
retained during resolution reductions than those land
classes occurring as elongate features. When forest
patch size was larger than the current pixel, there was
relatively little change in the distribution or
proportion of land classes. When pixel size exceeded
patch size, patch retention became unpredictable and the
spatial relationships between land classes were altered.

The spatial arrangement of land classes and forest 
patches also influenced their retention. Rare land 
classes distributed in patchy arrangements disap- 
peared more rapidly than did contiguous classes. 
Classes that were clumped disappeared slowly with in- 
creasing pixel size and classes that existed as large, 
continuous units were retained at coarse levels. In con- 
trast, the loss of pattern information was pronounced 
at coarse levels when one or more land classes were 
scattered as small patches or distributed in a noncon- 
tinuous manner. 

The changes observed during spatial scaling were 
influenced by landscape complexity. Dale et al. (1989) 
found that the rate of information loss was greater for 
complex landscapes; however, the loss of spatial het- 
erogeneity was more striking on simple landscapes. 
The addition or omission of a single land class on a 
simple landscape resulted in substantial changes in 
both the Simpson and evenness indices. The intersper- 
sion and binary comparison matrix metrics regularly 
overestimated the heterogeneity of simple landscapes, 
suggesting that scale problems occur in spatially ho- 
mogeneous landscapes. 

There were pattern metrics that were not influenced 
by landscape complexity. The rate of pattern loss for 
the fragmentation indices, percent forest cover, and 
the patch metrics was influenced by factors other than 
complexity (e.g., the distribution of land classes and 
the pixel-patch size relationships). 

There were instances where pattern loss was greater 
on complex landscapes (e.g., forest-edge length, con- 
vexity, forest interior, and forest contiguity). These 
landscapes quickly lost edge and shape detail with the 
continual smoothing of land class boundaries that oc- 
curred during changes in scale. These metrics were 
also influenced by the shifting distribution of forest ac- 
companying boundary changes. 


General Trends in Landscape Pattern Change 


Modeling spatial scale is a homogenizing process from 
which emerge several trends. A pattern metric may ei- 
ther increase or decrease depending upon the charac- 
teristics of a landscape image at a particular resolu- 
tion. Spatial scaling systematically favors land classes 
for retention or exclusion based on their area and lo- 
cation. The sampling process also excludes land 
classes based on their characteristic size and shape. If 
a forest patch or other land class area drops below the 
resolution of the landscape image, the resulting pat- 
tern description is modified dramatically. 

Landscape features that do remain after the scaling 
process are methodically altered in size and shape. The 
subsequent analysis of these generalized images results 
in land class omission or the exaggeration of areas in 
relation to their true areas on the ground. 

Much of the change observed with transformation 
of spatial scale centers on the loss of discernable detail 
and includes the following: removal of small inclu- 
sions and patches results in reassignment to another 
land class; enlargement of these features results in area 
overestimation. Spatial heterogeneity is averaged out, 
masking localized variability. Boundary smoothing re- 
sults in the loss of edge and shape detail; the shifting 
of boundaries distorts land class proportions as edge 
pixels are transformed from one land class to another. 
Pixel redistribution influences the connectivity of for- 
est and other land classes. The spatial arrangement of 
land classes is altered, while the relationships between 
like pixels are obscured (Chou 1991). 

The changes observed may vary with the general- 
ization algorithm. However, information will be lost 
when using taxonomic, spatial, or sampling general- 
ization. With reduction in resolution, error is intro- 
duced by discarding different features of the original 
pattern (Hole and Campbell 1985); each change can- 
not ensure that landscape features will be represented 
at coarse resolution levels. 


Variability of Pattern Metrics and 
the Semivariogram 


Modeling variance may illuminate the influence of
spatial resolution on landscape pattern, while the
use of mean estimates for pattern metrics may mask
landscape variation. The timing and magnitude of
variability with respect to spatial resolution provide
useful information for landscape analysis. Several
approaches are based on the observation that variance
increases as transitions are approached in hierarchical 
systems (O'Neill et al. 1986). If variance is measured 
as a function of spatial resolution, increased variance 
may indicate the approach of a change in pattern 
(Turner et al. 1989b). 

Metric variability associated with changing grain 
size may define the limits within which meaningful ex- 
trapolation is possible. Extrapolation error will differ 
among the metrics selected based on their behavior. 
The semivariogram assesses metric constancy; the sill 
and the range together serve to identify scales where 
extrapolation is possible and intervals producing com- 
parable landscape descriptions. 

Semivariograms have been used for describing the 
variation in vegetation structure (Palmer 1988) and 
digital imagery (Weishampel et al. 1992); for detecting 
patterns in canopy structure (Cohen et al. 1990); for 
optimal resolution determination (Atkinson et al. 
1990); and for mining applications (Curran 1988). In 
this study, the semivariogram has proven to be a use- 
ful tool for identifying how spatial resolution influ- 
ences landscape pattern description. 

The semivariogram parameter, yp, describes the 
magnitude of a metric's variation, while the form of 
the curve provides insight into the nature of that vari- 
ation. It is useful for identifying metrics that change 
direction predictably and those that are unpredictable. 
In addition to identifying the inherent variation within 
spatial scaling, semivariogram analysis may suggest 
the sampling resolution required for characterizing 
landscape pattern. 

Several pattern metrics may be candidates for ex- 
trapolation to other scales. The predictable behavior 
displayed by the binary comparison matrix, intersper- 
sion, Simpson, and evenness metrics suggests their po- 
tential for extrapolation. Other metrics showing con- 
sistent rates of change with bounded variance over 
partial intervals include forest contiguity, convexity, 
and forest edge length. It is possible that these metrics 
could be extrapolated between 30 and 240 meters. 

Semivariogram analysis also identified intervals 
where the semivariance was stable as grain size
changed. These intervals suggest the potential for
extrapolation within that interval, presenting the limits
of tolerance for choosing resolution level. For exam- 
ple, landscape description was equivalent between 30 
and 90 meters for forest cover and the number of 
patches. 
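
One way to locate such a stable interval is to scan the semivariogram from the baseline and stop at the first lag where semivariance departs from its starting value by more than a chosen tolerance. The tolerance and the sample values below are hypothetical; the chapter identified stable intervals visually.

```python
def stable_interval(semivariogram, tolerance):
    """Return (first_lag, last_lag) of the leading run of lags whose
    semivariance stays within `tolerance` of the starting value."""
    points = sorted(semivariogram)
    if not points:
        raise ValueError("semivariogram is empty")
    start_lag, start_gamma = points[0]
    last_lag = start_lag
    for lag, gamma in points[1:]:
        if abs(gamma - start_gamma) > tolerance:
            break
        last_lag = lag
    return start_lag, last_lag


if __name__ == "__main__":
    # Made-up semivariances at pixel sizes of 30 .. 150 meters.
    example = [(30, 0.0), (60, 0.001), (90, 0.002),
               (120, 0.02), (150, 0.05)]
    print(stable_interval(example, tolerance=0.005))
```

The returned pair marks the pixel sizes between which descriptions of the landscape could be treated as comparable for that metric.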

The use of semivariograms identifies metrics that 
exhibit dramatic variability during the spatial scaling 
process. For metrics demonstrating unbounded vari- 
ance (e.g., fragmentation indices I and II), landscape 
description cannot be made at one spatial resolution 
based on another. The variability observed for percent 
forest interior and forest cover values at different
levels of spatial resolution indicates that the choice of
resolution needs to be carefully considered prior to 
their use for landscape analysis. Patch metric variabil- 
ity indicates that pattern misrepresentation can occur 
at coarse resolution levels; the semivariograms imply 
the potential error associated with extrapolation may 
be great across spatial scales. Examining the semivari- 
ograms of landscape metrics demonstrates the impor- 
tance of using variance to discriminate metric useful- 
ness at different spatial scales. 


Spatial Scale and Wildlife Habitat Assessment 


The landscape metrics examined here are valuable de- 
scriptors of wildlife habitat. Their behavior at differ- 
ent spatial scales highlights the importance of under- 
standing how pattern influences species occurrence. 
Only at the proper spatial scale do metrics have
potential meaning for resource managers. A metric that
overestimates a desirable habitat component at coarse
resolution levels results in an inflated assessment of
habitat potential. Similarly, a metric that minimizes
undesirable habitat components results in an overrat- 
ing of habitat condition. 

One objective of habitat relationship research is to 
identify biologically important variables that have the 
ability to predict species occurrence (Young and 
Hutto, Chapter 8). Habitat quality for many species 
contains a spatial component related to the arrangement
and amount of habitat elements across the landscape.
If the spatial scale of measurement alters the
values of these variables, it affects our ability to
predict species occurrence and to assess habitat
conditions.

TABLE 11.2. The influence of spatial scale on landscape pattern description and wildlife habitat evaluation.

                                                                                                     Influence on
Species                   Common name           Key reference (a)           Landscape metric         Metric (b)  Evaluation
Canis latrans             Coyote                Thomas 1979                 Simpson Index            Increase    Overestimation
Urocyon cinereoargenteus  Gray fox              Fritzell 1987               Binary Comp. Matrix      Increase    Overestimation
Phasianus colchicus       Pheasant              Yahner 1988                 Interspersion            Increase    Overestimation
Accipiter nisus           Sparrowhawk           Hunter 1990a                Forest Edge Length       Decrease    Underestimation
Catharus fuscescens       Veery                 Hamel 1992                  Forest Edge Length       Decrease    Overestimation
Martes pennanti           Fisher                Rosenberg and Raphael 1986  Fragmentation Index I    Increase    Underestimation
Meleagris gallopavo       Eastern wild turkey   Gustafson et al. 1994       Distance to Nonforest    Increase    Underestimation
Martes americana          Marten                Bissonette et al. 1989      Distance to Nonforest    Increase    Overestimation
Seiurus aurocapillus      Ovenbird              Sweeney and Dijak 1985      Percent Forest Cover     Variable    Unpredictable
Wilsonia citrina          Hooded warbler        Whitcomb et al. 1981        Percent Forest Interior  Decrease    Underestimation
Dryocopus pileatus        Pileated woodpecker   Schroeder 1983b             Mean Patch Size          Variable    Unpredictable
Streptopelia turtur       European turtle dove  Van Dorp and Opdam 1987     Interpatch Distance      Variable    Unpredictable
Ursus americanus          American black bear   Rudis and Tansey 1995       Forest Contiguity        Decrease    Underestimation

(a) Source of observations citing the importance of landscape pattern.
(b) Influence of spatial scale when measured from fine to coarse levels of resolution.


Table 11.2 suggests how landscape description at
different spatial scales may influence wildlife habitat
evaluation. The examples indicate how failure to
consider spatial scale may result in misleading habitat
assessments. For instance, coyotes (Canis latrans) and
gray fox (Urocyon cinereoargenteus) use a diversity of
habitats. Both species hunt in open shrub habitat for
small prey and den in downed logs in old-growth
stands. At coarse resolution levels, habitat suitability
is overestimated using spatial diversity metrics such as
the binary comparison matrix and Simpson indices.

The arrangement and distribution of vegetative 
communities (interspersion) is important to several 
species. Pheasants forage in agricultural fields, prefer- 
ring fields juxtaposed with hay crops for nesting and 
roosting cover. Coarse resolution levels overestimate 
the value of this landscape characteristic. 

Forest edge length reflects the transition between 
different community types. Edge characteristics are 
also important descriptors for raccoon (Procyon lotor)
habitat. This species uses edge habitat for foraging
and travel (Pedlar et al. 1997). The European spar- 
rowhawk (Accipiter nisus) and the Eurasian badger 
(Meles meles) are also frequently associated with edge. 
Habitat suitability is underestimated for each of these 
species when landscapes are analyzed using relatively 
large pixels. 

Fragmentation occurs when a tract of forest or 
other habitat is broken into patches. This process in- 
fluences a number of species. Fisher (Martes pen- 
nanti), gray fox, and ringtail cat (Bassariscus astutus) 
are sensitive to fragmentation (Rosenberg and 
Raphael 1986). Large pixels result in a significant
increase in the distance-to-nonforest and other fragmentation
indices, introducing error into habitat assessment
for these species. The hooded warbler, the
worm-eating warbler (Helmitheros vermivorus), and
the black-and-white warbler (Mniotilta varia) are also
intolerant of fragmentation. The habitat for these forest
interior specialists is underestimated at coarse resolution
levels.

The persistence of many species may be linked to 
the number, size, and degree of isolation of forest 
patches. Patch isolation is often a significant predictor 
of relative abundance for many bird species. However, 
habitat suitability assessment is unpredictable using 
patch metrics derived from large pixels. Habitat con- 
nectivity facilitates dispersal and enhances habitat 
quality by connecting patches of critical habitat. For- 
est contiguity is an important explanatory variable for 
predicting habitat quality for black bear (Ursus ameri- 
canus). Landscape analysis at coarse resolution levels 
reduces contiguity estimates, underestimating habitat 
suitability for this species. 

Several studies have examined the influence of spa- 
tial scale on habitat assessment. Applying a marten 
(Martes americana) habitat model across several reso- 
lution levels, Schultz and Joyce (1992) documented 
changes in home range suitability prediction. Increas- 
ing pixel size resulted in simplified landscapes and the 
loss of small forest patches; the ability to recognize pat- 
tern detail was lost. The mean area of suitable home 
range was greater at fine resolution levels. Schultz and 
Joyce (1992) found that changes in forest patch size 
and distribution modified habitat assessment. 
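The loss of small patches under increasing pixel size can be illustrated with a minimal sketch. It is not the procedure of Schultz and Joyce (1992); the grid values, the majority-rule aggregation, and the function names `aggregate` and `count_patches` are assumptions chosen only for illustration.

```python
def aggregate(grid, factor):
    """Coarsen a binary grid by majority rule over factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, factor):
        row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            # A coarse cell is forest only if forest covers more than half the block.
            row.append(1 if sum(block) * 2 > len(block) else 0)
        coarse.append(row)
    return coarse


def count_patches(grid):
    """Count 4-connected patches of forest (value 1) with a flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1 and (r, c) not in seen:
                patches += 1
                stack = [(r, c)]
                seen.add((r, c))
                while stack:
                    i, j = stack.pop()
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if (0 <= ni < rows and 0 <= nj < cols
                                and grid[ni][nj] == 1 and (ni, nj) not in seen):
                            seen.add((ni, nj))
                            stack.append((ni, nj))
    return patches


# Hypothetical fine-resolution forest map (1 = forest, 0 = nonforest).
fine = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
coarse = aggregate(fine, 2)
print(count_patches(fine), count_patches(coarse))
```

In this toy landscape the two single-pixel patches disappear once pixels are doubled in size, so the coarse map reports fewer patches than the fine one.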

Laymon and Reid (1986) investigated the influence 
of resolution level on a spotted owl (Strix occidentalis) 
habitat suitability model. Using 4-hectare pixels re- 
sulted in accurate predictions of home range use when 
compared to known owl locations. In contrast, 16-hectare
pixels masked small habitat pockets of important
value. The usefulness of model prediction declined
appreciably with the coarseness of spatial
scale. In another study on home range estimation,
Hansteen et al. (1997) assessed the effect of spatial
resolution (0.06- to 100-meter pixels) on estimates of
tundra vole (Microtus oeconomus) occurrence. Polygon
estimates of home range size increased substantially
as a function of resolution; large pixel sizes produced
estimates of home range area in locations where
voles were absent.

Johnson et al. (Chapter 12) reported that spatial 
scale influenced variable approximation relative to 
amphibian occurrence. Both local habitat parameters 
and landscape structure (fragmentation and edge 
contiguity) were important for treefrog prediction.
The relationship between species occurrence and ex- 
planatory variables was a function of the scale of 
analysis. 

Stoms (1992) examined the effects of resolution 
change on species richness prediction in Sierra Nevada 
landscapes. As pixel size increased, there was a decline 
in the number of mapped habitat types and the num- 
ber of predicted species. When landscape elements 
dropped below the resolution of the coverage, richness 
dropped dramatically. The use of 100-hectare pixels 
resulted in a substantially different pattern of species 
richness than that observed for smaller pixels. 

MacNally and Quinn (1998) found that monitor- 
ing striated thornbill (Acanthiza lineata) at different 
spatial resolution levels (15- to 45-hectare pixels) caused
dramatic fluctuations in density prediction. In a similar
example, Haufler et al. (1999a) discuss a study that
compared mapping resolution using 1,000-hectare 
and 9.5-hectare pixels. Using fine resolution levels, 
sixty-nine community types were detected that sup- 
ported ninety-eight bird species. In contrast, twelve 
community types were identified (supporting sixty- 
seven species) using 100-hectare pixels. 

The choice of spatial scale has a profound influence 
on both the strength and the nature of wildlife-habitat 
relationships. Analyzing landscapes at an inappropri- 
ate scale may not accurately describe existing pattern 
but instead reflect artifacts of the scale of measure- 
ment. Whether the scale at which landscape character- 
istics are measured matches the scale at which those 
attributes influence wildlife species remains an impor- 
tant question. The measurement of spatial pattern in 
turn has implications for the prediction of species 
occurrence. 


Selecting the Appropriate Spatial Scale 


Landscape pattern description is but one source of 
uncertainty for predicting species occurrence (Mau- 
rer, Chapter 9). Measurement error in landscape pre- 
dictor variables can bias metric values leading to un- 
reliable predictions of species occurrence. Pattern 
analysis is most useful when the scale of analysis 
matches the scale at which species use the landscape. 
For example, Cogan (Chapter 18) discusses critical 
biodiversity sites that are detectable only when using
fine resolution levels. These remnant areas have high 
habitat value and are important for predicting rare 
plant populations. 

It is unlikely that a single scale of spatial resolution 
can be identified for all species in a given landscape 
(MacNally and Quinn 1998). Consideration should be 
given to species needs (e.g., habitat preferences and 
the distribution of those habitats). For small mam- 
mals, the arrangement and patchiness of herbaceous 
understory may be essential, while for a gray fox, the 
interspersion of vegetation communities on a land- 
scape may be important. Determining the appropriate 
scale for describing pattern and for predicting species 
occurrence should consider the differences in the 
home range sizes, geographic distributions, and habi- 
tat requirements. Laymon and Reid (1986) reported 
that a pixel size comprising 1-2 percent of home range 
was useful for predicting spotted owl habitat. For 
other species, the need for travel corridors or a diver- 
sity of forest types may suggest another scale of analy- 
sis. The heterogeneity of the landscape is yet another 
consideration. Young and Hutto (Chapter 8) detected 
fine resolution patterns of occurrence within cover 
types for Swainson’s thrush (Catharus ustulatus). Dif- 
ferences in site occupancy reflected fine-scale variation 
in habitat characteristics. This knowledge could be 
used to guide managers toward an appropriate spatial 
scale for this species. 
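The rule of thumb reported by Laymon and Reid (1986), a pixel covering 1-2 percent of the home range, can be turned into a pixel dimension with simple arithmetic. The sketch below assumes square pixels; the 400-hectare home range and the function name `pixel_side_m` are hypothetical and are not taken from the studies cited.

```python
from math import sqrt

SQUARE_METERS_PER_HECTARE = 10_000


def pixel_side_m(home_range_ha, fraction):
    """Side length (meters) of a square pixel covering `fraction` of a home range."""
    pixel_area_ha = home_range_ha * fraction
    return sqrt(pixel_area_ha * SQUARE_METERS_PER_HECTARE)


# Hypothetical 400-hectare home range, evaluated at 1 and 2 percent.
for fraction in (0.01, 0.02):
    print(fraction, round(pixel_side_m(400, fraction), 1))
```

For the hypothetical home range above, the 1 percent rule corresponds to a 4-hectare pixel, 200 meters on a side.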

It is evident that species habitat requirements (e.g., 
minimum area limitations) will affect the choice of 
spatial scale. Species exploit resources within a differ- 
ent range of scales; large species have home ranges en- 
compassing hundreds of kilometers, while small 
species use modest ranges. The choice of spatial scale 
(grain and extent) for wildlife habitat evaluation 
should reflect species range, seasonal area use, and the 
landscape characteristics influencing its lifestyle. 

There is no single correct scale for describing a sys- 
tem (Levin 1992). The appropriate scale will depend 
on management objectives, the species involved, the 
underlying landscape pattern, and the processes be- 
lieved to be important. If the assessment focuses upon 
interior species habitat, the loss of boundary detail ac- 
companying coarse scales may yield unacceptable re- 
sults. The conservation of rare habitats dictates fine 
resolution analyses whereas large-area analyses may
suffice for monitoring wetland loss. Patterns of habitat
association for many species may be dependent on the 
scale of investigation. Different processes will be im- 
portant at different spatial scales. 


Management Implications 


Landscape analysis is a compromise between spatial 
scale and the accuracy of pattern description. Often 
the resolution of mapping data is based upon avail- 
able funds, storage, or data availability. However, it 
is important to consider the level of effort and the 
quality of the resulting analysis. Landsat imagery 
(30-meter pixels) may adequately differentiate pat- 
tern for many species but may be too coarse for 
small mammals or too detailed for black bear habitat 
assessment. 

The ability to assess environmental conditions will 
reflect the resolution of the data used for analysis. The 
changes that occur as resolution is reduced may not 
produce acceptable results; analyzing patterns at 
coarse resolution levels accompanies the loss of local- 
ized variability and boundary detail. A manager 
should recognize how much detail will be retained 
during landscape analysis and whether this detail is 
sufficient for the species of concern. The loss of detail 
is related to the number of sampled pixels used for de- 
scribing a landscape. Selecting resolution presents 
many of the same problems observed with selecting 
sample size. Coarse resolution levels result in small 
samples and associated problems of induction. 

This chapter proposes that the relationship between 
pattern and species occurrence may be altered when 
applied at different spatial scales. This suggests a num- 
ber of steps for the resource manager. First, geostatis- 
tical techniques such as semivariogram analysis can 
identify transitions between spatial scales. Second, a 
sensitivity analysis can identify whether the same pat- 
tern emerges at different levels of resolution. If pattern 
changes dramatically at a particular resolution level, 
the decision is made to lose the previous level of detail 
or to reverse course and select a finer resolution level. 
Spatial analysis, as outlined above, can be useful for 
selecting the appropriate scale for landscape analysis. 
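As a rough illustration of the first step, an empirical semivariogram can be computed for values sampled at even spacing along a transect. This is a one-dimensional sketch with hypothetical data; operational analyses are normally two-dimensional and rely on dedicated geostatistical software. The function name `semivariogram` is an assumption.

```python
def semivariogram(values, max_lag):
    """Empirical semivariance for lags 1..max_lag along an evenly spaced transect.

    gamma(h) = sum((z[i] - z[i + h]) ** 2) / (2 * N(h)), where N(h) is the
    number of sample pairs separated by lag h. Requires max_lag < len(values).
    """
    gamma = {}
    for h in range(1, max_lag + 1):
        squared_diffs = [(values[i] - values[i + h]) ** 2
                         for i in range(len(values) - h)]
        gamma[h] = sum(squared_diffs) / (2 * len(squared_diffs))
    return gamma


# Hypothetical transect crossing alternating forest (1) and nonforest (0)
# patches, each two samples wide.
transect = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
gamma = semivariogram(transect, 4)
for lag in sorted(gamma):
    print(lag, round(gamma[lag], 3))
```

In this toy transect the semivariance peaks at a lag equal to the patch width and falls to zero at the full period of the pattern, which is the kind of transition the text refers to. A sensitivity analysis would repeat the pattern metrics of interest at several resolutions and compare the results.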

Management objectives are related to ecosystem 
processes. If an objective is to mimic natural disturbance
regimes using silvicultural treatment, fine resolution
analyses may be required. Conversely, if a management
objective is to describe the pattern of regional
defoliation, coarse resolution analyses may be ade- 
quate. In either case, spatial resolution is guided by the 
phenomena of interest and the management objective. 
Economic constraints will continue to dictate the 
nature of landscape analyses; the objective is to maxi- 
mize the information obtained at an acceptable level 
of confidence. Spatial scale should be the first consid- 
eration in landscape analyses, since scale directly influ- 
ences the nature of the final results (Meentemeyer and 
Box 1987). Selecting a spatial scale for providing spe- 
cific information should accompany consideration of 
the pattern information excluded from that scale. Un- 
derstanding those limitations will guide the selection 
of appropriate scales to meet management needs. 
There is a growing demand for landscape analysis, 
driven by the premise that ecological processes can be 
predicted by pattern (Gustafson 1998). The relationship
between landscape pattern analysis and spatial scale has 
important implications for both resource management 
and wildlife habitat evaluation. Selecting the appropri- 
ate scale (extent and resolution) may be critical for suc- 
cessful landscape analyses on forested landscapes and 
for the prediction of species occurrence. 
Predicting the Occurrence of Amphibians:
An Assessment of Multiple-scale Models 


Catherine M. Johnson, Lucinda B. Johnson, 
Carl Richards, and Val Beasley 


Numerous reports of local and regional amphibian
population declines recently have focused
attention on factors potentially influencing the distri- 
bution and abundance of amphibians (see Blaustein 
and Wake 1990; Pechmann et al. 1991; Wake 1991; 
Blaustein et al. 1994; Green 1997). These declines are 
generally attributed to habitat loss and fragmentation; 
yet, in many cases it is unclear whether declines are 
human induced or simply represent natural popula- 
tion fluctuations (Pechmann and Wilbur 1994; 
Blaustein et al. 1994). Furthermore, the status of a 
given species may be dependent on the spatial scale of 
interest, with some species appearing to be in decline 
in a given geographic area while remaining relatively 
constant at other scales of spatial occurrence (Pechmann
and Wilbur 1994; Hecnar and M'Closkey 1997).

The distribution of local amphibian populations 
may be greatly influenced by site-specific habitat fac- 
tors (e.g., wetland water chemistry, vegetative compo- 
sition, and interspecific interactions); however, a 
broader perspective is required to understand factors
affecting regional populations (Hecnar and M'Closkey
1996). Landscape-level land-use practices can have
both direct and indirect effects on wetland habitats
and amphibian populations (Green 1997; Lehtinen et
al. 1999). In regions dominated by intensive agricultural
practices and urbanization, wetlands have been
destroyed at an alarming rate, with some states losing
more than 80 percent of their original wetland acreage 
since 1780, primarily as a result of agricultural activi- 
ties (Dahl 1990). Extant wetlands in such stressed 
landscapes also may be modified to the extent that 
they become unsuitable for many species as a result of 
pollution, introduced species, vegetative composition 
changes, altered hydrologic regimes, or other anthro- 
pogenically induced changes. 

Most regional amphibian populations could be de- 
scribed as metapopulations (Levins 1969, 1970; Han- 
ski and Gilpin 1991) whose stability is dependent on a 
balance of population extinction and colonization 
rates. As such, the extinction of many individual pop- 
ulations due to local or broad-ranging habitat 
changes, with no concurrent colonization of nearby 
sites, could have a long-term negative effect on re- 
gional populations. The loss and fragmentation of 
nearby upland habitats, such as forest and grassland, 
also may render a wetland unsuitable for many am- 
phibian species, especially those that spend most of 
their life cycle in those habitats. Furthermore, changes 
in broad land-use patterns that affect the ability of in- 
dividuals to move between local amphibian popula- 
tions could ultimately result in the loss of a given 
species over a broad geographic area. Such move- 
ments appear to be facilitated by certain landscape 
features, such as stream corridors, and obstructed by 
others, such as roads and highly fragmented habitats
(Seburn et al. 1997; deMaynadier and Hunter 1998;
Gibbs 1998; Lehtinen et al. 1999).

Since both local habitat parameters and landscape 
structure could contribute to declines in amphibian 
populations, a multiscale perspective is needed to ob- 
tain a holistic view of amphibian species status. Stud- 
ies conducted at several spatial scales can provide a 
better resolution of the scale(s) at which these species 
are responding to environmental heterogeneity and 
the interrelationships among scales (Wiens 1989a).
However, few studies have examined amphibian pop- 
ulations at both site-specific and large-area scales 
simultaneously, and assessments of the effects of land- 
scape structure are even more rare. Such an under- 
standing is critical if we hope to predict the potential 
effects of future land-use changes on regional and 
global populations. 

To assess the relative influence of local habitat vari- 
ables, landscape structure, and anthropogenic influ- 
ences on anurans, we analyzed the relationship be- 
tween species occurrence and habitat parameters at 
three spatial scales. Scale, in the context of this manu- 
script, refers to the extent over which observations are 
made. Three scales of explanatory variables were in- 
cluded in this study: site-specific (i.e., physical habitat 
parameters), local landscape (within 2 kilometers of a 
site), and broad landscape (within 10 kilometers). The 
objectives of our analyses were to: (1) assess the im- 
portance of various explanatory variables in predict- 
ing species occurrences over a broad geographic area, 
and (2) compare the predictive abilities of models de- 
rived using site-specific data versus landscape data. 


Study Area 


The study extends from central Minnesota through 
Wisconsin and northeastern Illinois and is entirely lo- 
cated within the Eastern Broadleaf Forest Eco- 
Province (Bailey 1983; Fig. 12.1). Current land cover 
within this region ranges from forest to intensive agri- 
cultural or urban land uses. 


Site Selection 


Thirteen regional watersheds were randomly selected 
from a list of candidate fourth-order watersheds in the 
study region. Candidate wetlands (predominantly
palustrine emergent, open water, and aquatic bed
types) were identified on National Wetland Inventory 
(NWI) and Wisconsin Wetland Inventory maps. Field 
reconnaissance was conducted to select four to eight 
wetlands in each watershed that represented a range in 
wetland quality (based on visual assessments) and a 
gradient of surrounding land-use types. Most sites 
were rejected because they no longer existed, we were 
unable to acquire permission to access them, or they 
were threatened by development (e.g., close proximity 
to new development, or impending sale of the land). 


Methods 


We conducted night-time anuran calling surveys at 
each wetland three times during the 1998 and 1999 
field seasons (in March-April, May-June, and late 
July) and once in 2000 (early April) to determine 
species richness and relative abundance levels at each 
site. These surveys were conducted in accordance with 
the North American Amphibian Monitoring Program 
(NAAMP) protocol (Lannoo 1998a) to ensure consis- 
tency of anuran species richness and abundance data. 
Each set of calling surveys was initiated in the south- 
ern part of the study region and progressed north- 
ward, and all wetlands in a watershed were surveyed 
during the same evening. During each survey, air and 
water temperature and ambient weather conditions 
were recorded in addition to the presence and abun- 
dance ranking for each frog species. 

Daytime visits were made to each site, in conjunc- 
tion with evening calling surveys, to gather data re- 
garding the physical, chemical, and biological charac- 
teristics of the wetlands. Wetlands were characterized 
in terms of their size, general morphology, hydrologic 
regime, dominant wetland class, and vegetative com- 
position. Aerial photographs and aerial compliance 
slides (U.S. Dept. of Agriculture, County Soil Conser- 
vation Service Offices) were examined to assess juxta- 
position to other wetlands and potential anthro- 
pogenic stressors and to describe surrounding land-use 
types. 

Water-quality measurements conducted during each 
of the surveys included pH, alkalinity, conductivity, 
ammonia, nitrate, temperature, and dissolved oxygen. 
These measurements were made immediately upon arrival
at a given site using a water-quality probe (YSI
600XL sonde with a 610 DM data logger, Yellow
Springs Instruments, Yellow Springs, Ohio). Maximum
water depth and change in water level since the
previous visit also were measured during each site
visit.

Figure 12.1. Study area, including sixty-two wetland sites in thirteen regional watersheds located in Minnesota, Wisconsin, and northern Illinois.

During 1998, macroinvertebrates were collected 
using time- and area-constrained dipnet surveys (modified
from Lenat 1988); samples were sorted and identified
in the field to the lowest-possible taxon. During
the third survey, voucher specimens were collected 
from each site, preserved in a 10 percent ethyl alcohol 
solution, and identified to family in the laboratory. 
Landowner survey data were used in conjunction with 
dipnet surveys to determine whether fish were present 
in each wetland. 
Amphibian larvae and egg masses encountered during
dipnetting surveys were identified whenever possible.
Visual encounter surveys (Crump and Scott 1994)
also were conducted for anurans in conjunction with 
daytime site visits. During June-July 1998 and July 
1999, an attempt was made to capture metamorphos- 
ing frogs at all sites. These species data were combined 
with the calling survey, dipnet, and visual encounter 
survey information to develop a list of frog species 
present at each site. 


Landscape Data 


Local and broad-scale spatial data, including land use, 
land cover, hydrography, wetlands, roads, Quaternary 
geology, and watershed boundaries were incorporated 
into a geographic information system (GIS) database. 
Land cover data were obtained from the Illinois Nat- 
ural History Survey (Illinois Department of Natural 
Resources 1996), Upper Midwest Gap Analysis Pro- 
gram (Lillesand et al. 1998), and Minnesota Land 
Management Information Center. Wetland data were 
acquired from the National Wetlands Inventory 
(NWI) and the Wisconsin Wetland Inventory. 

The U.S. Census Bureau's TIGER (1995) data were 
used to describe roads and hydrography. Other re- 
gional spatial data were collected from the USGS 
7.5-minute digital elevation models (DEMs) and the 
Quaternary Geologic Atlas of the Conterminous 
United States. Ancillary landscape data (e.g., finer- 
grain land-use and land-cover data in the immediate 
vicinity of study sites) were collected during site visits 
and incorporated into the overall spatial database. 
Landscape structure, including fragmentation pat- 
terns, patch density, connectivity, nearest-neighbor 
distances, and other landscape metrics, was quantified 
using FRAGSTATS (McGarigal and Marks 1995). 


Analysis 


We used Systat 7.0.1 (SPSS 1997) for all statistical 
analyses. Explanatory variables (i.e., wetland characteristics
and landscape variables) were screened using
Principal Components Analysis (PCA) and Pearson's
correlation coefficients to remove collinearity and reduce
the total number of variables used in subsequent
analyses. The remaining explanatory variables were
checked for normality and appropriate transformations
were applied where necessary. Variables were
then separated into three groups: (1) site-specific wet- 
land characteristics, (2) local landscape variables, and 
(3) large area landscape variables. Site-specific data in- 
cluded both physical and biotic measurements taken 
during on-site field visits. Because of the diurnal and 
seasonal variability in water-quality parameters, and 
because we were interested in relative differences be- 
tween sites rather than absolute values, we used only 
ranks of the maximum or minimum values for these 
parameters in our analyses. Local-scale landscape 
variables included spatial data within a 1- to 2-kilometer
buffer around each site, collected via aerial-photo in- 
terpretation and available state and national digital 
data sets. Broad-scale landscape variables included 
similar spatial data within a buffer of 10 kilometers. 
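Screening by Pearson's correlation coefficient can be sketched as follows. This is not the Systat procedure used in the study: the 0.8 threshold, the greedy keep-first rule, the sample values, and the variable name PercentOpen are all assumptions for illustration (PercentForest and MinpH are borrowed from Table 12.2 as labels only).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length samples.

    Raises ZeroDivisionError if either sample has no variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def screen(variables, threshold=0.8):
    """Keep a variable only if |r| with every already-kept variable is below threshold."""
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values at five sites.
variables = {
    "PercentForest": [10, 20, 30, 40, 50],
    "PercentOpen": [90, 80, 70, 60, 50],   # perfectly collinear with PercentForest
    "MinpH": [6.5, 7.2, 6.8, 7.9, 7.0],
}
print(screen(variables))
```

The outcome of a greedy screen depends on the order in which variables are considered, which is one reason an ordination such as PCA is often used alongside pairwise correlations.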

To assess whether any explanatory variable was 
consistently associated with presence or absence of in- 
dividual anuran species, simple relationships were ex- 
amined for all species using Spearman rank correla- 
tions. Sites were grouped by presence or absence of 
each species. The Mann-Whitney U-Test was used to 
test for significant differences in explanatory variables 
between groups (occupied versus unoccupied sites) for 
eight of the ten species encountered in this study: the 
chorus frog (Pseudacris triseriata), American toad 
(Bufo americanus), Cope's gray treefrog (Hyla 
chrysoscelis), eastern gray treefrog (Hyla versicolor), 
spring peeper (Pseudacris crucifer), wood frog (Rana 
sylvatica), leopard frog (Rana pipiens), and green frog 
(Rana clamitans). Analyses of wood frog occurrences 
were confined to data from eight watersheds located 
in northern Wisconsin and Minnesota because of this 
species’ limited geographic range in the study area 
(Casper 1996; Harding 1997). The ranges of the bull- 
frog (Rana catesbeiana) and mink frog (Rana septen- 
trionalis) each incorporated only three to five of the 
watersheds in the study area, and so these species were 
eliminated from all analyses. 
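The comparison of occupied and unoccupied sites rests on the Mann-Whitney U statistic, which can be computed from pooled ranks. The sketch below uses hypothetical values and returns only the statistic; judging significance requires tables or a statistical package (for example, `scipy.stats.mannwhitneyu`). The function name `mann_whitney_u` is an assumption.

```python
def mann_whitney_u(group_a, group_b):
    """Return the U statistic for group_a, using midranks for tied values."""
    # Pool the observations, tagging each with its group (0 = a, 1 = b).
    pooled = sorted((value, group)
                    for group, sample in enumerate((group_a, group_b))
                    for value in sample)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 through j
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_a = len(group_a)
    return rank_sum_a - n_a * (n_a + 1) / 2


# Hypothetical percent forest cover around occupied and unoccupied wetlands.
occupied = [40, 55, 60, 72]
unoccupied = [10, 25, 30, 45]
print(mann_whitney_u(occupied, unoccupied), mann_whitney_u(unoccupied, occupied))
```

The two U values always sum to the product of the two sample sizes; a value near that product, or near zero, indicates strong separation between the groups.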

To determine whether the habitat relationships ob- 
served at the eco-province scale (i.e., our entire study 
area) also occurred over smaller areas, we reexamined 
the species occurrence data within two geographic re- 
gions, hereafter referred to as bioregions, based on 
amphibian community faunal regions as described by 
Brodman (1998). The southern grouping included 
thirty-two sites in seven watersheds located in Illinois
and southern and eastern portions of Wisconsin. The
northern group consisted of thirty sites in six watersheds,
including all those in Minnesota and the most
northwestern watershed in Wisconsin.

TABLE 12.1.
Species occurrences (presence or absence within each site) by watershed, from April 1998 to April 2000.

Watershed (n)ᵃ  Bufameᵇ  Hylchr  Hylver  Psucru  Psutri  Rancat  Rancla  Ranpip  Ransep  Ransyl
1 (5)           5        1       0       1       5       3       3       1       0       0
2 (4)           3        1       0       0       3       3       4       3       0       0
3 (4)           4        4       2       2       3       (0)     4       4       (0)     0
4 (8)           6        7       8       6       7       2       8       6       0       0
5 (4)           3        2       3       3       3       0       4       2       0       1
6 (4)           1        3       0       4       3       0       4       4       0       0
7 (3)           2        1       1       3       3       (0)     2       2       0       3
8 (5)           [?]      1       4       4       3       0       4       4       1       3
9 (5)           4        4       2       1       5       0       4       5       (0)     1
10 (4)          [?]      1       0       1       3       (0)     1       4       (0)     1
11 (7)          3        6       5       7       7       0       5       6       0       4
12 (4)          2        2       2       3       2       (0)     1       4       1       2
13 (5)          3        5       3       2       3       0       2       5       2       5
All Sites (62)  41       38      30      37      50      8       46      50      4       20

ᵃ n refers to the number of sites in each watershed.
ᵇ See note in Table 12.2 for Latin names.

In addition to examining correlations among species 
occurrences and explanatory variables, habitat occu- 
pancy models were developed for the spring peeper, 
gray treefrog, and wood frog using logistic regression. 
These species were selected for model development be- 
cause the other species encountered occurred at either 
very few (20 percent or fewer) or most sites (80 percent 
or more), or were habitat generalists. Since we wanted 
to develop parsimonious models that could easily be 
interpreted, we considered only main effects in the 
model. Beginning with a model that included all ex- 
planatory variables remaining following exploratory 
analyses, we manually removed and added variables to
achieve a best-fit model, similar to an automated stepwise
regression procedure. Significance of variables included
in the final model for each scale was assessed at
an alpha level of 0.05 (Hosmer and Lemeshow 1989).
Goodness-of-fit of each logistic model was reflected by
both the percentage of sites correctly classified (the correct
classification rate, or CCR) and McFadden's
Rho-squared value (ρ²). Three scale-specific models
were developed for each species using each of the three 
groups of explanatory variables (i.e., site-specific, 
local-landscape, and broad-landscape scales), as well as 
one that incorporated variables from all three scales 
(Allvars). The models were compared for each species 
to determine the ability of data from each scale, and 
from all scales combined, to predict occurrence. Final 
models were tested against a reserve data set of sixteen 
sites added to the study in 1999. 
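The two goodness-of-fit measures can be computed directly from observed occurrences and the probabilities predicted by a fitted logistic model. The sketch below assumes a classification cutoff of 0.5, which the chapter does not state, and uses hypothetical observations and predictions; the function names are likewise illustrative.

```python
from math import log


def correct_classification_rate(observed, predicted_prob, cutoff=0.5):
    """Fraction of sites whose predicted class matches the observed class."""
    hits = sum(1 for y, p in zip(observed, predicted_prob)
               if (p >= cutoff) == (y == 1))
    return hits / len(observed)


def mcfadden_rho_squared(observed, predicted_prob):
    """McFadden's rho-squared: 1 - logL(model) / logL(intercept-only model).

    Predicted probabilities must lie strictly between 0 and 1, and the
    observations must include both presences and absences.
    """
    n = len(observed)
    ll_model = sum(log(p) if y == 1 else log(1 - p)
                   for y, p in zip(observed, predicted_prob))
    p_null = sum(observed) / n  # intercept-only model predicts the overall prevalence
    ll_null = sum(log(p_null) if y == 1 else log(1 - p_null) for y in observed)
    return 1 - ll_model / ll_null


# Hypothetical presence (1) or absence (0) at eight sites and model predictions.
observed = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [0.9, 0.8, 0.4, 0.2, 0.3, 0.6, 0.7, 0.1]
ccr = correct_classification_rate(observed, predicted)
rho2 = mcfadden_rho_squared(observed, predicted)
print(round(ccr, 2), round(rho2, 3))
```

The CCR depends on the cutoff and on species prevalence, whereas ρ² compares the fitted model with one that predicts the same probability everywhere, so the two measures need not rank models identically.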


Results 


Ten anuran species were observed in the sixty-two wet- 
lands included in this study. The number of sites these 
species occurred in ranged from four (mink frog) to fifty 
(leopard and chorus frogs; Table 12.1). Over two hun- 
dred explanatory variables were assembled from field 
and spatial data collection and spatial data analyses. 
After removal of collinear variables, the number of ex- 
planatory variables was reduced to sixty-two, including 
eighteen site-specific, twenty-six local-landscape, and 
eighteen broad-landscape variables (Table 12.2). 


TABLE 12.2.

Individual explanatory variables showing a significant difference between sites with or without each of eight anuran species.

Species columns: Bufame, Psutri, Rancla, Ranpip, Hylchr, Hylver, Psucru, Ransyl

Scale          Variable

Site-specific  Natural
               WetlandType
               Topography
               Rowcrop50
               Woods50
               MaxdepthDecr
               MinWaterTemp
               MinpH
               MinConductivity
               MaxAmmn
               MinDOsurf
               TotalInsect
               Fish

2 Km           AWMeanShapeIndex
               SimpsonEvenIndex
               ContagionIndex
               PatchLandSimilIndex
               PatchEdgeContrastIndex
               EmergentWet1k
               PalustrineWet2k
               EmergentWet2k
               PalustrineWetCount1k
               PatchDensityAg
               EdgeContrastIndexAg
               PatchFractalDimAg
               Intersp/juxtaposAg
               PercentUrban
               PatchDensityUrban
               Intersp/juxtaposUrban
               PercentForest
               CoreAreaCVForest
               Intersp/juxtaposForest
               PercentWetland
               PatchDensityWetland
               PatchFractalDimWetland
               PatchFractalDimOW
               CoreAreaCVOW

10 Km          No. Patches
               ContrastWtEdgeDensity
               MeanShapeIndex
               NearestNeighborCV
               Intersp/juxtaposLandsc
               PalustrineWet10k
               Freeway
               LocalRoads
               PatchFractalDimHDUrban
               Intersp/juxtaposHDUrban
               PatchDensityAg
               CoreAreaCVGrass
               PercentForest
               PatchDensityForest
               Intersp/juxtaposForest
               PercentOW
               PatchDensityOW
               EdgeContrastIndexOW

[Per-species significance marks (*, **, ***) are not legible in the source layout.]

Note: Mann-Whitney p-values: P < .05 = *, P < .01 = **, and P < .001 = ***. Scale refers to the spatial extent from which the variables were derived: site-specific, local landscape (2 kilometers), or broad landscape (10 kilometers); descriptions of each variable are given in the Appendix.

Bufame = American toad, Bufo americanus
Psutri = chorus frog, Pseudacris triseriata
Rancla = green frog, Rana clamitans
Ranpip = leopard frog, Rana pipiens
Hylchr = Cope's gray treefrog, Hyla chrysoscelis
Hylver = eastern gray treefrog, Hyla versicolor
Psucru = spring peeper, Pseudacris crucifer
Ransyl = wood frog, Rana sylvatica


Individual Species-Habitat Relationships 


Spearman rank correlations between each species and 
explanatory variable are presented in Table 12.1. 
Although no two species exhibited the same pattern of 
relationship with all variables, five species did show 
similar patterns for many variables. The treefrogs, 
spring peeper, wood frog, and leopard frog had simi- 
lar relationships (i.e., either negative or positive) with 
47 percent of the variables assessed. Green-frog occur- 
rences were inversely related (i.e., had an opposite 
sign) for the majority of the variables that the afore- 
mentioned species had in common. 
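
The rank-correlation comparison described above can be sketched roughly as follows. This is a minimal illustration, not the authors' analysis: the function names and the toy presence and forest-cover values are hypothetical, ties receive their average rank, and a constant input (zero variance) would raise a division error.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def sign_agreement(rho_a, rho_b):
    """Fraction of shared variables whose correlations have the same sign."""
    shared = [name for name in rho_a if name in rho_b]
    same = sum((rho_a[name] > 0) == (rho_b[name] > 0) for name in shared)
    return same / len(shared)


# Hypothetical data: presence (1) or absence (0) at six sites, with a
# made-up percent-forest value for each site.
presence = [1, 1, 0, 1, 0, 0]
percent_forest = [40.0, 35.0, 10.0, 22.0, 15.0, 5.0]
rho = spearman_rho(presence, percent_forest)
```

Applying `sign_agreement` to two species' dictionaries of coefficients gives the kind of "similar relationship" percentage reported in the text, under the assumption that similarity means matching sign.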

Results of the Mann-Whitney U-test indicated that
the occurrence of six species was significantly associated
with variables at all three spatial scales (Table 12.2).
The exceptions were the American toad, which showed no
strong correlation with any of the local variables measured
and with only one of the 10-kilometer variables, and the
chorus frog, which exhibited a significant difference in
occurrence associated with only one explanatory variable.
The spring peeper and gray treefrog exhibited many
similarities in the pattern of relationships with explanatory
variables, especially with the landscape metrics.
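
As a rough sketch of the test behind Table 12.2, the code below compares one explanatory variable between sites where a species was present and sites where it was absent. It assumes a two-sided normal approximation with a tie correction and no continuity correction, which may differ from the software the authors used; the function names are hypothetical, and the star thresholds follow the table note.

```python
import math


def mann_whitney_u(present, absent):
    """Return (U for the `present` group, two-sided p-value).

    The p-value uses the large-sample normal approximation with a tie
    correction; it is only a rough guide for very small samples.
    """
    present, absent = list(present), list(absent)
    n1, n2 = len(present), len(absent)
    pooled = sorted(present + absent)

    # Assign each distinct value the average rank of its tied run.
    rank_of = {}
    tie_term = 0.0
    start = 0
    while start < len(pooled):
        end = start
        while end + 1 < len(pooled) and pooled[end + 1] == pooled[start]:
            end += 1
        rank_of[pooled[start]] = (start + end) / 2 + 1
        run = end - start + 1
        tie_term += run ** 3 - run
        start = end + 1

    rank_sum = sum(rank_of[v] for v in present)
    u1 = rank_sum - n1 * (n1 + 1) / 2

    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u1, p_value


def significance_stars(p_value):
    """Map a p-value to the star notation used in the table notes."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""
```

Running this once per species and variable, and keeping only the starred results, would produce a table with the same layout as Table 12.2.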

When the study area was split into northern and 
southern bioregions for analysis, no consistent rela- 
tionship between a given species occurrence and the 
explanatory variables was observed for the two re- 
gions at either the site-specific or the local-landscape 
scale. Significant relationships between explanatory 
variables at these scales were almost completely dif- 
ferent for the two regions examined for both the 
spring peeper and green frog (Table 12.3). Only the 
broad-scale landscape variables (i.e., 10 kilometers) 
showed some similar relationships for the two 
groupings. 


Site Occupancy Models 


The best-fit logistic regression models for individual
species had McFadden's rho-squared (ρ²) values ranging
from only 0.09 to 0.64 and correct classification
rates (CCRs) ranging from 56 to 85 percent (Tables
12.4-12.6). The spring peeper CCR values for the
site-specific, local-landscape, and broad-landscape models
were 57, 79, and 68 percent, respectively (Table 12.4).
TABLE 12.3.


Explanatory variables with significant Mann-Whitney test results associated with the occurrence of spring peepers (Pseudacris crucifer) or
green frogs (Rana clamitans) in two ecoregions.


Spring Peeper Green Frog 

Scale Variables NW SE Variables NW SE 

Site-specific Natural um WetlandType ios 
Woods50 * MinWaterTemp dus 
MinpH $ MinpH x 
MinConductivity E MaxAmmn oy 
MaxNitrate l * MinDOsurf v 
Totallnsect às 

2 Km AWMeanShapelndex ae EmergentWet1k $ 
PatchEdgeContrastIndex ** EmergentWet2k 2 
PalustrineWet2k 7h PalustrineWetCountik + 
EdgeContrastindexAg mk EdgeContrastindexAg E 
Intersp/juxtaposAg RS MeanPatchFractalDimAg i 
Intersp/juxtaposUrban d s Intersp/juxtaposForest ^ 
PercentForest * 

PercentWetland kt 
PatchDensityWetland * 

10 Km MeanShapelndex * ContrastWtEdgeDensity wi 
NearestNeighborCV B NearestNeighborCV g v 
Intersp/juxtaposLandsc x m Intersp/juxtaposLandsc m 
PalustrineWet10k ee Intersp/juxtaposHDUrban n 
PatchfractIDimHDUrban S PatchDensityForest x 
CoreAreaCVGrass a EdgeContrastindexOWw s 
PercentForest * ** l 
PatchDensityForest + 
PatchDensityOW dob oer 


Note: NW refers to sites in Minnesota and northwestern Wisconsin (n = 30); those in the southeast include SE Wisconsin and Illinois (n = 32).
Scale refers to the spatial scale from which the explanatory variables were derived; descriptions of variables are given in the Appendix.
P-values: P < .05 = *, P < .01 = **, and P < .001 = ***.


However, when explanatory variables for all three
scales were incorporated into model development, the
CCR increased to 85 percent; the general fit of the
model (as described by the ρ² value) also increased.
This model also correctly classified 87.5 percent of
sites in the reserve data set (Table 12.7). This final
model (Allvars) included both site-specific variables
(artificial versus natural wetland and distance to the
nearest wetland) and landscape variables (number of
wetlands within 1 kilometer, wetland patch and
agricultural edge contrasts, landscape diversity, and urban
patch interspersion). In comparing the single-scale
models, the local landscape (1-2 kilometer) model
was the best predictor of spring peeper presence,
correctly classifying about 21 percentage points more of
the sites than did the site-specific model (78.8 versus
57.3 percent; Table 12.4).
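
The two fit measures reported in Tables 12.4 to 12.6 can be illustrated with a short sketch. It assumes McFadden's rho-squared is computed as one minus the ratio of the fitted model's log-likelihood to that of an intercept-only model, and that a site is classified as occupied when its predicted probability is at least 0.5; the text does not state that cutoff, so it is an assumption, as are the function names.

```python
import math


def log_likelihood(observed, predicted):
    """Bernoulli log-likelihood of 0/1 outcomes given predicted probabilities."""
    eps = 1e-12  # guards against log(0) for extreme predictions
    total = 0.0
    for y, p in zip(observed, predicted):
        total += math.log(max(p, eps)) if y else math.log(max(1 - p, eps))
    return total


def mcfadden_rho_squared(observed, predicted):
    """1 - LL(model) / LL(null), where the null model predicts the base rate.

    Assumes both occupied and unoccupied sites are present; otherwise the
    null log-likelihood is zero and the ratio is undefined.
    """
    base_rate = sum(observed) / len(observed)
    ll_null = log_likelihood(observed, [base_rate] * len(observed))
    ll_model = log_likelihood(observed, predicted)
    return 1 - ll_model / ll_null


def correct_classification_rate(observed, predicted, cutoff=0.5):
    """Percentage of sites whose predicted class matches the observation."""
    hits = sum((p >= cutoff) == bool(y) for y, p in zip(observed, predicted))
    return 100 * hits / len(observed)
```

Given the CCR values in Table 12.4, the gap between the local-landscape and site-specific spring peeper models is 78.8 - 57.3 = 21.5 percentage points.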

The site-specific scale model for the gray treefrog, 
like that for the spring peeper, was the poorest predic- 
tor of species site occupancy (56.4 percent; Table 
12.5). However, while the local landscape model was 
the best individual-scale predictor for the spring 
peeper, the broad landscape variables provided a bet- 
ter model for predicting gray treefrog occurrence 
(71.7 versus 60.6 percent). The Allvars model for this 
TABLE 12.4.


Results of four habitat occupancy models for spring peepers (Pseudacris crucifer), developed using variables from three discrete 
spatial scales, and a combined set of variables. 


Scale n ρ²a CCRb Variablec Sign P
Site-specific 62 .087 57.3 MinConductivity - js
2 Km 62 .509 78.8 SimpsonEvenIndex - joie 
Intersp/juxtaposUrban + prio 
PalustrineWetCount1k - s 
EdgeContrastindexAg + pii 
EdgeContrastlndex - wies 
PalustrineWet2k -— i 
PercentWetland t v 
10 Km 62 .266 67.7 PatchDensityOW + KEk 
CoreAreaCVGrass + Tus 
Allvars 62 .644 85.1 EdgeContrastindexAg + PN 
PalustrineWetCount1k - "us 
SimpsonEvenlIndex - * 
PatchEdgeContrastIndex = * 
Intersp/juxtaposUrban t as 
WetlandDist + we 
Natural Gs i 


Note: Explanatory variables and their individual p-values are presented for each model.
a McFadden's rho-squared (ρ²) is an estimate of the fit of the overall model.
b CCR represents the percentage of correctly classified sites.
c Descriptions of variables are given in the Appendix.


TABLE 12.5. 


Results of four habitat occupancy models for gray treefrogs (Hyla versicolor), developed using variables from three discrete spatial 
scales, and a combined set of variables. 


Scale n ρ²a CCRb Variablec Sign P

Site-specific 62 .294 56.4 Woods50 + wr 

2 Km 62 iL (AS 60.6 Intersp/juxtaposAg + Ghd 

10 Km 62 .387 71.7 Intersp/juxtaposLandsc + NK
PercentOW t * 

Allvars 62 .433 74.6 Intersp/juxtaposLandsc + ek 
Woods50 + Pun 


Note: Explanatory variables and their individual p-values are presented for each model.
a McFadden's rho-squared (ρ²) is an estimate of the fit of the overall model.
b CCR represents the percentage of correctly classified sites.
c Descriptions of variables are given in the Appendix.


species included only two variables, one from the
site-specific scale (presence of woods within 50 meters of
the wetland) and one from the broad-landscape scale
(landscape patch interspersion); this multiscale model
again provided the best ρ² and CCR values (0.43 and
74.6 percent). Similar results (CCR = 75 percent) were
obtained when tested against the reserve data set
(Table 12.7).

The wood-frog models showed a different pattern
than those of the spring peeper and gray treefrog. The
TABLE 12.6.


Results of four habitat occupancy models for wood frogs (Rana sylvatica) developed using variables from three discrete spatial
scales and a combined set of variables.


Scale n ρ²a CCRb Variablec Sign P
Site-specific 38 .230 64.8 Woods50 + Mk 
2 Km 38 .087 56.4 PatchEdgeContrastindex - a 
10 Km 38 21365 58.7 PatchDensityOW + UR
Intersp/juxtaposForest - v 
Allvars 38 .528 79.3 WetlandType_2 + *
Woods50 + Mk 
PercentUrban2k - * 


Note: Because of the wood frog's geographic range, only sites in watersheds 5 and 7-13 were used for this analysis. Explanatory variables and
their individual p-values are presented for each model.
a McFadden’s rho-squared (ρ²) is an estimate of the fit of the overall model.
b CCR represents the percentage of correctly classified sites.
c Descriptions of variables are given in the Appendix.


percentage of correctly classified sites was higher for
the site-specific model (64.8 percent) than for either
the local-scale (56.4 percent) or broad-scale landscape
model (58.7 percent; Table 12.6), though none of the
individual-scale models performed well. All three of
these models were relatively simple; the site-specific
and local-landscape models each included only one
explanatory variable (woods within 50 meters of the
wetland, and patch edge contrast, respectively), and
the broad-landscape model included two (patch density
of open water, and the forest interspersion and
juxtaposition index). The combined model resulted in
a much higher correct classification rate (79.3 percent)
than any of the individual-scale models and incorporated
both site-specific and local-landscape scale
variables (wetland type, presence of woods within 50
meters, and percentage of urban land within 2
kilometers). This model also correctly predicted wood frog
presence on 75 percent of the sites added in 1999
(Table 12.7).


TABLE 12.7.


Percentage of sites correctly classified (CCR) using the final (Allvars) habitat occupancy models for
spring peepers (Pseudacris crucifer), gray treefrogs (Hyla versicolor), and wood frogs (Rana
sylvatica).

                                Model data      Test dataa
Common name     Species code    n     CCR       n     CCR
Spring peeper   psucru          62    85.1      16    87.5
Gray treefrog   hylver          62    74.6      16    75.0
Wood frog       ransyl          38    79.3      16    75.0

a Test data refer to those sites added to the study in 1999.


Discussion


The eight frog species considered in our analyses
exhibited a wide range of associations with habitat
variables at all three spatial scales examined (site-specific,
local-scale, and broad-scale landscapes). The diversity
of these relationships indicates that no single
environmental factor or set of factors from those examined
has an overriding influence on all or even most anuran
species in this region. Although it is not surprising that
each species is associated with a different suite of
site-specific wetland characteristics, the variation in
response to landscape variables, especially those related
to anthropogenic disturbances (e.g., indices of habitat
fragmentation and road densities), was somewhat
unexpected.

Increased habitat fragmentation resulting from 
habitat destruction is considered one of the most im- 
portant factors causing amphibian declines in indus- 
trial regions (Blaustein et al. 1994). However, meas- 
ures of habitat fragmentation should be viewed 
cautiously in terms of their relationship to species dis- 
tributions. The landscape included in our study area is 
dominated by agricultural land (Fig. 12.2), with wetlands,
forested areas, and other "natural" habitats
generally distributed as smaller patches throughout an 
agricultural matrix. The juxtaposition of many, rela- 
tively small nonagricultural patches interspersed 
within the agricultural matrix (rather than just a few 
larger "natural" patches) may benefit the long-term
survival of local populations by decreasing the dis- 
tance frogs must travel across a hostile environment 
(e.g., agricultural fields) to reach other suitable habi- 
tats. Viewed in this light, the strong positive relation- 
ship between landscape patch interspersion and the 
occurrence of species such as the spring peeper and 
gray treefrog is consistent with what is known about 
the biology of these species. Although these anurans 
appeared to favor a more highly fragmented landscape 
in this region, their presence also was associated with 
a low degree of edge contrast (i.e., the land immedi- 
ately surrounding the wetland was not agricultural or 
urban). Both species also exhibited a strong tie to for- 
est cover at all three spatial scales. 

The results of these analyses indicate that use of
wetlands as breeding habitat by many anuran species
in the midwestern United States is dependent on
explanatory factors operating at various spatial scales.
In addition, the significance of the relationship
between these explanatory factors and species
occupancy is strongly influenced by the geographic regions
analyzed. For example, the relationships between
spring peeper occurrences and explanatory variables
shifted as the spatial scale of analyses changed from
the entire study area to the two bioregions assessed
(Table 12.3). Although the presence of woods within
50 meters of a site, the percentage of forest cover
within 2 kilometers, and the density of forest patches
within 10 kilometers all were significantly associated
with the presence of spring peepers in the southeastern
bioregion, those relationships were not mirrored in
the northwestern region. This may be due to the
smaller overall percentage of forest cover in the
southeastern portion of the study area (Fig. 12.2). Lehtinen
et al. (1999) also found differences in species-habitat
relationships for the American toad at different
geographical scales of analysis. Although they observed a
significant relationship between species occurrence
and forest cover over the entire study area (twenty-one
wetlands in central and southwestern Minnesota), that
relationship was not detected when the study area was
subdivided into prairie versus forest ecoregions.

Figure 12.2. Percentages of different land cover types in the
study area. The northwest and southeast bioregions are based
on amphibian community faunal regions as described by
Brodman (1998).

The spatial scale of these analyses also may
account, in part, for the relatively low correct site
classification rates observed in the three final logistic
models. Because the relationships between species
site occupancy and the explanatory variables differed
greatly when viewed at a smaller geographic scale
(i.e., southeastern and northwestern bioregions
versus the overall study area), it seems likely that
different pressures are acting on the amphibian
communities in each region. The significance of variables that
have a strong influence on species occurrence at a
finer scale (e.g., the northwestern sites only) could
be lost when sites in a different landscape matrix
(e.g., southeastern sites) are included in the analysis.
Conversely, analyses that are limited to more-restricted
geographic areas may miss broad-scale
landscape patterns that influence the ability of
individuals to disperse between wetland habitats and
maintain regional populations.

The predictive ability of our models probably 
could be improved with additional data, since our 
models were based on only two years of survey data 
and one year of environmental data. In a three-year 
study, Hecnar and M’Closkey (1997) found high 
turnover rates for green frogs in individual wetlands, 
with some sites occupied only one in three years. 
Similarly, Skelly et al. (1999) found that changes in 
the distribution of fourteen amphibian species were 
surprisingly common between surveys conducted in 
1967-1974 and 1988-1992. The physical and chem- 
ical properties of wetlands also may vary dramati- 
cally between years as a result of stochastic events, 
such that assessments based on one year of data are 
not necessarily characteristic of the general condition 
of the wetland. Finally, we may have failed to
incorporate, or used an inappropriate scale of
measurement for, variables that are critical to site selection
by these species. Given the importance of
interpopulation proximity to the persistence of amphibian
metapopulations (Sjögren-Gulve 1994), data regarding
the distance to other occupied wetlands probably
as might inclusion of occurrence or abundance data 
for other species, especially potential predators or 
competitors. In addition, more detailed habitat data 
(e.g., dominant vegetation types, soil chemistry data, 
or continuous water chemistry measurements) could 
prove important in discerning differences between 
wetland habitats. 

Since there is no single correct scale at which mod- 
els should be developed, we must determine the ap- 
propriate scale of spatial analysis based on the prob- 
lem or question at hand (Wiens et al. 1986; Wiens 
1989a; Levin 1992). Many of the anuran species in- 
cluded in this study have both site-specific wetland re- 
quirements (e.g., wetland hydroperiod or water- 
quality tolerances) and surrounding landscape require- 
ments (e.g., wooded habitat within dispersal distance); 
therefore, a multiscale approach is necessary to
understand the mechanisms controlling the abundance and
distribution of these species’ populations (Goodwin
and Fahrig 1998). The results of the logistic models
developed in this study support such an approach, in- 
dicating that anuran species occurrence in wetlands 
can best be predicted with a combination of variables 
based on different spatial scales. Our analyses also in- 
dicate the importance of considering variables associ- 
ated with landscape structure (e.g., fragmentation and 
edge contrast) in addition to more traditional land- 
scape measures (e.g., percent cover). The interpreta- 
tion of such variables should be made with caution, 
however, and should include careful consideration of 
the consistency of the landscape matrix across the re- 
gion studied. 

Given the broad scale of our analyses, these models 
are of limited use to managers attempting to predict 
species occurrences at individual wetlands. As has 
been stressed repeatedly in this book (e.g., by Van 
Horne, Maurer, and Wiens [Chapters 4, 9, and 65, re- 
spectively]), the results of our study point to the limi- 
tations of predictive models in their spatial applica- 
tion. Relationships between species and individual 
explanatory variables varied when assessed at differ- 
ent spatial scales (extent) and when compared across 
different geographic regions at similar spatial scales. 
However, informative patterns emerge in the relation- 
ships between species occurrence and explanatory 
variables, and comparisons between geographic re- 
gions may be useful. For example, variables that con- 
sistently appear as significant predictors at varying 
spatial scales or geographic regions could be consid- 
ered important when developing broad-scale conser- 
vation plans. Managers concerned about localized 
population declines, on the other hand, may want to 
focus on variables whose negative associations with 
species occurrence are limited to their geographic area 
of concern. 
Explanatory variables used in model development, including the abbreviation used and a brief description of each. Refer to
McGarigal and Marks (1995) for more detailed explanation of these landscape metrics.


Scale          Variable                 Description
Site-specific  Natural                  natural vs artificial wetland
               WetlandType              dominated by one of three wetland types (1 = wet meadow; 2 = marsh; 3 = shallow open water)
               Size                     wetland size
               WetlandDist              distance to nearest wetland
               Topography               subjective ranking of topographic relief around wetland, from 1 (relatively flat) to 4 (steep)
               Rowcrop50                presence/absence of row crops within 50 m of wetland
               Woods50                  presence/absence of woodland within 50 m of wetland
               MaxDepthdecr             max decrease in wetland water depth over the field season
               MinWaterTemp             min wetland water temperature (water temp measured at each of three surveys)
               MinpH                    min wetland pH (measured at each of three surveys)
               MinConductivity          min wetland conductivity (measured at each of three surveys)
               MaxAmmn                  max ammonium concentration (measured during first two surveys)
               MaxNitrate               max nitrate (measured during first two surveys)
               MinDOsurf                dissolved oxygen just below water surface (measured during last two surveys)
               TotalInsect              number of insect families observed in wetland
               InsectOrder              number of insect orders observed in wetland
               Gastropoda               presence/absence of gastropods in wetland
               Fish                     presence/absence of fish in wetland
2 Km           AWMeanShapeIndex         area-weighted mean shape index for the landscape within 2 km of wetland study sites
               SimpsonEvenIndex         Simpson's evenness index for the landscape within 2 km
               ContagionIndex           landscape contagion index for areas within 2 km
               PatchLandSimilIndex      patch landscape similarity index
               PatchEdgeContrastIndex   patch edge contrast index
               Roads1k                  weighted index of road density within 1 km
               EmergentWet1k            area of emergent wetland within 1 km
               PalustrineWet2km         area of palustrine wetland within 2 km
               EmergentWet2k            area of emergent wetland within 2 km
               PalustrineWetCount1k     count of palustrine wetlands within 1 km
               PercentAg                percent agricultural land within 2 km
               PatchDensityAg           patch density of agricultural land within 2 km
               EdgeContrastindexAg      area-weighted mean edge contrast index for agricultural areas within 2 km
               PatchFractalDimAg        area-weighted mean patch fractal dimension for agricultural areas within 2 km
               Intersp/juxtaposAg       index of juxtaposition and interspersion for agricultural areas within 2 km
               PercentUrban             percent urban land within 2 km
               PatchDensityUrban        patch density of urban land within 2 km
               Intersp/juxtaposUrban    index of juxtaposition and interspersion for urban areas within 2 km
               PercentForest            percent forested land within 2 km
               CoreAreaCVForest         patch core area coefficient of variation (variation in size of forested core areas)
               Intersp/juxtaposForest   index of juxtaposition and interspersion for forested areas within 2 km
               PercentWetland           percent of emergent wetland habitat within 2 km
               PatchDensityWetland      patch density of emergent wetlands within 2 km
               PatchFractalDimWetland   mean patch fractal dimension of wetlands within 2 km (complexity of patch perimeters)
               PatchFractalDimOW        mean patch fractal dimension of open water areas (ponds) within 2 km
               CoreAreaCVOW             patch core area coefficient of variation (variation in size of pond core areas)
10 Km          #Patches                 number of patches in the landscape within 10 km of wetland study sites
               ContrastWtEdgeDensity    contrast-weighted edge density of landscape within 10 km
               MeanShapeIndex           mean shape index of landscape within 10 km
               NearestNeighborCV        nearest-neighbor coefficient of variation of landscape within 10 km
               Intersp/juxtaposLandsc   index of landscape juxtaposition and interspersion within 10 km
               PalustrineWet10k         area of palustrine wetland within 10 km
               Freeway                  measure of highway density within 10 km
               Localroads               measure of local road density within 10 km
               PatchFractDimHDUrban     area-weighted mean patch fractal dimension for high density urban areas within 10 km
               Intersp/juxtaposHDUrban  index of juxtaposition and interspersion for high density urban areas within 10 km
               PatchDensityAg           patch density of agricultural land within 10 km
               CoreAreaCVGrass          patch core area coefficient of variation (variation in size of grassland core areas)
               PercentForest            percent forested land within 10 km
               PatchDensityForest       patch density of forested areas within 10 km
               Intersp/juxtaposForest   index of juxtaposition and interspersion for forested areas within 10 km
               PercentOW                percent of open water habitat within 10 km
               PatchDensityOW           patch density of ponds within 10 km
               EdgeContrastindexOW      area-weighted mean edge contrast index for ponds within 10 km


Notes: Scale refers to the spatial extent from which the explanatory variables were derived (site-specific: physical, chemical, and biological wetland
data derived from field visits; 2 km: spatial data for areas within 1-2 km of each wetland study site; and 10 km: spatial data for areas within 10
km of study sites).
Dynamic Patterns of Association between
Environmental Factors and Island Use 
by Breeding Seabirds 


Catherine M. Johnson and William B. Krohn 


The requirements for the establishment of
seabird colonies include suitable breeding sites, 
adequate food supplies within foraging range, and 
safety from terrestrial predators (Birkhead and Fur- 
ness 1985; Cairns 1992). The presence of a seabird 
colony on an island indicates that the area meets these 
requirements and provides “suitable” habitat, whether 
of high or low quality. However, not all suitable sites 
may be occupied by a species at a particular time, and 
unoccupied sites may have substantial conservation 
value (Wiens 1996b). Since entire seabird colonies 
may temporarily or permanently abandon islands 
(Mendall 1936; Drury 1973; Buckley and Downer 
1992), the identification of suitable, unoccupied habi- 
tats can be important for long-term conservation plan- 
ning. Habitat occupancy models can be used to iden- 
tify characteristics associated with suitable habitats. 
They also may be effective tools for predicting which 
sites may be used in the future and for examining the 
potential effects of specific habitat management plans 
on a given species. However, the predictive ability of 
such habitat occupancy models may be limited to rela- 
tively large-area analyses (Rotenberry 1986). 

Since both physical habitat structure and social fac- 
tors affecting populations may change over time, the 
results of habitat models must be interpreted carefully 
with regard to temporal effects (O'Connor 1987b; 
Wiens 1989a; Krohn 1996; Van Horne, Chapter 4). 


The ability of an island and the surrounding aquatic 
environment to provide suitable habitat for a particu- 
lar species may change over time as the local environ- 
ment changes. Although more-plastic species may be 
able to cope with dramatic changes in the physical 
characteristics of an area, populations of species that 
are more closely tied to a specific food resource or 
habitat type may not be able to adapt to changing re- 
source levels. Habitat models generally are tied to 
population and resource levels and spatial distribu- 
tions that exist at a particular time. Though a habitat 
model may be highly robust for the time period in 
which it was developed, its ability to predict long-term 
habitat use is dependent on both changes in the envi- 
ronment and changes in a species’ population level 
over time (Krohn 1996). Thus, habitat suitability or 
quality assessments made at any one point in time 
may be misleading when used for long-term conserva- 
tion planning, especially in the case of highly stochas- 
tic environments such as the insular habitats found 
along the North Atlantic coast. 

The number and types of habitats occupied by a 
species may shift over time, with more varied habitats 
used as population densities increase (Svardson 1949; 
O'Connor 1986; Rosenzweig 1991). It is generally as- 
sumed that individuals will select the highest-quality 
available habitats first (Brown 1969b; Fretwell and 
Lucas 1969). At higher densities, individuals are likely 
to be found in suboptimal and sink habitats (Van 
Horne 1983; O'Connor 1987a; Pulliam 1988; Danielson
1992). As such, analyses of populations at peak
densities could mask the existence of habitat prefer- 
ences, and assessment of habitat quality may be in- 
formative only at lower densities (O'Connor 1986). 
However, since most usable sites are occupied at high 
densities, habitat occupancy models could be used to 
identify the characteristics that distinguish suitable 
sites, of either high or low quality, from unsuitable 
sites. 

Coastal Maine has over three thousand islands 
and ledges, providing nesting habitat for a variety of 
seabirds and other waterbirds. The diversity of insu- 
lar and aquatic habitats in this area and the avail- 
ability of historic seabird census information make 
coastal Maine a good location for the development 
and testing of temporal seabird habitat models. The 
number of seabirds breeding on Maine's coastal is- 
lands has increased steadily since their near extirpa- 
tion at the beginning of the twentieth century, 
though populations of some species have declined 
since the mid-1900s (Drury 1973; Krohn et al. 1992; 
Johnson and Krohn 1998). The number of islands 
used by nesting seabirds also has increased; however, 
seabird colonies still occur on only about 10 percent 
of the islands along the Maine coast (Conkling 
1995). In addition, the availability of specific aquatic 
food resources (e.g., fish and shellfish) in the Gulf of 
Maine has changed drastically since the late 1800s 
(Ojeda and Dearborn 1990; NOAA 1991). The Atlantic
cod (Gadus morhua), haddock (Melanogrammus
aeglefinus), halibut, and smaller groundfish that
supplied a superabundant food resource for cor- 
morants at the beginning of the twentieth century 
were decimated by the 1970s, and the Atlantic her- 
ring (Clupea harengus) and capelin (Mallotus villosus) 
populations that provided the primary food supply for 
the large alcid colonies along the Newfoundland coast 
were reduced to residual levels by the early 1980s 
(Mowat 1984). 

In a previous study, habitat (island) occupancy and 
colony size models were developed for five seabird 
species breeding in coastal Maine using 1977 survey 
data (Johnson 1998). The resultant models appeared 
to be robust when tested against reserve data from 
that time period. However, we wondered how accurate
these models would be at predicting future habitat
use for long-term conservation planning. In order
to test the predictive ability of such models, we devel- 
oped temporally distinct island occupancy models for 
three seabird species and colony size models for one 
species based on survey data collected from several 
time periods. The data used to develop these models 
came from surveys conducted along the Maine coast 
at various times between 1941 and 1997. The vari- 
ables included in each temporal model for a given 
species were compared to determine the utility of these 
variables and the overall habitat models as long-term 
indicators of island occupancy. Models also were 
tested using data from the later surveys to determine 
their long-term predictive ability. 


Study Area 


The study area is located in mid-coastal Maine and 
extends from Pemaquid Point eastward to Schoodic 
Point (Fig. 13.1). This area encompasses over sixteen 
hundred islands and ledges, ranging in size from a few 
square meters to thousands of hectares. Approxi- 
mately 10 percent of these islands currently are known 
to support at least one seabird colony, defined as a 
group of at least five nesting pairs of a given species. 
Three of the twelve seabird species breeding in Maine 
were considered in this study: the double-crested cor- 
morant (Phalacrocorax auritus), common eider (So- 
materia mollissima), and great black-backed gull 
(Larus marinus). These species were chosen because of 
the availability of comprehensive, historical survey 
data for the study area. 


Selection of Occupied and Unoccupied Islands


All occupied islands in the study area are larger than 
0.01 hectare and smaller than 500 hectares in size. 
Larger islands (more than 500 hectares) generally are 
inhabited by people or mammalian predators, and the 
smallest islands (less than 100 square meters) provide 
little suitable habitat for seabird colony establishment. 
Thus, islands selected for use in this study were lim- 
ited to the size range of occupied islands; this limita- - 
tion allowed us to focus on other, less-obvious, habi- 
tat-selection factors. The islands that met this size 
criteria (approximately 1,130) were divided into occu- 
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Figure 13.1. Mid-coastal Maine study area, extending from Pemaquid Point and the Outer Islands eastward to 


Schoodic Point. a 


pied (by any of the species considered in this study) 
and unoccupied groups, based on 1976-1977 survey 
data (Korschgen 1979). Five hundred islands were 
randomly selected from within each group for analysis 
at a ratio of 1:3 (i.e., 125 islands from the occupied 
group and 375 from the pool of unoccupied islands). 
Aerial photographs (9x9, 1:12,000-scale color photos) 


were reviewed to eliminate islands that were con- 
nected to the mainland or populated islands (with 
roads) at high tide. After dropping islands for which 
aerial photographs were unavailable, a total of 306 is- 
lands were included in subsequent analyses. For indi- 
vidual species island occupancy models, all islands 
supporting a colony of that species were included, as 


174 PREDICTING SPECIES OCCURRENCES 


were an equal number of randomly selected, unoccu- 
pied (by that species) islands. 
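The two sampling steps described above can be sketched as follows. This is only an illustration: the island identifiers, pool sizes, and function names are hypothetical, and the chapter does not describe the software or random-number procedure actually used.

```python
import random

def select_islands(occupied, unoccupied, n_total=500, ratio=(1, 3), seed=0):
    """Draw n_total islands from the two groups at the given occupied:unoccupied ratio."""
    rng = random.Random(seed)
    n_occ = n_total * ratio[0] // sum(ratio)   # 125 for 500 islands at 1:3
    n_unocc = n_total - n_occ                  # 375 for 500 islands at 1:3
    return rng.sample(occupied, n_occ), rng.sample(unoccupied, n_unocc)

def balanced_sample(occupied, unoccupied, seed=0):
    """Keep every occupied island and add an equal number of random unoccupied ones."""
    rng = random.Random(seed)
    return list(occupied), rng.sample(unoccupied, len(occupied))

# Hypothetical island IDs; the actual split of the ~1,130 islands is not given here.
occupied_pool = [f"occ_{i}" for i in range(200)]
unoccupied_pool = [f"unocc_{i}" for i in range(930)]

occ, unocc = select_islands(occupied_pool, unoccupied_pool)
print(len(occ), len(unocc))

# Species-level model: e.g., 15 islands with a colony matched by 15 without.
sp_occ, sp_unocc = balanced_sample(occupied_pool[:15], unoccupied_pool)
print(len(sp_occ), len(sp_unocc))
```

Fixing the seed makes the draw repeatable, which matters when the same pool of unoccupied islands is to be compared across survey periods.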


Model Development 


Island incidence data and nest counts were used as re- 
sponse variables for island occupancy models and 
colony size models, respectively. Double-crested cor- 
morant incidence and nest count data were from sur- 
veys conducted during 1943-1944 (Gross 1944a), 
1976-1977 (Korschgen 1979), and 1994-1997 (John- 
son 1998). For common eiders, we used occurrence 
data collected during 1941-1943 (Gross 1944b), 
1976-1977 (Korschgen 1979) and 1984 (Maine De- 
partment of Inland Fisheries and Wildlife [MDIFW] 
unpublished data). Great black-backed gull occur- 
rences were based on surveys conducted during 1944 
(Gross 1945), 1976-1977 (Korschgen 1979), and 
1994-1996 (MDIFW unpublished data; Johnson 
1998). Nest count data used for this study were lim- 
ited to those made using ground nest counts for 1944 
and 1977 and aerial counts for 1997. Although these 
methodologies have different biases, restricting each 
time period to a consistent methodology ensured that 
all estimates within the same survey period were 
comparable. 

We used a geographic information system (GIS) 
(ArcInfo, Version 7.03, ESRI) to develop a spatial 
database of environmental and anthropogenic factors 
potentially associated with the use of islands in this 
area by nesting seabirds (Johnson 1998). Spatial vari- 
ables were chosen for inclusion in this database based 
on the general habitat requirements for each species 
considered and the practicality of obtaining the data. 
These variables fell into three broad categories: physi- 
cal island characteristics, foraging habitat, and human 
disturbance and predation. Refer to Johnson (1998) 
for an explanation of the methodology used to de- 
velop explanatory variables used in these habitat 
models. 

Logistic regression (Systat 7.0; SPSS Inc.) was used
to develop species island occupancy models and linear
regression for colony size models. All explanatory
variables with non-normal distributions were transformed
to approximate a normal distribution. For each species
and survey period, all occupied islands and an equal
number of randomly selected unoccupied islands were
selected for model development. Explanatory variables
were screened using Principal Components Analysis (PCA)
and Pearson's correlation coefficient to remove
collinearity and reduce the total number of variables
used in the creation of habitat models. Variables
associated with land use also were removed for the
analysis of historical (pre-1976) island occupancy, due
to a lack of information regarding these factors.
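One way to carry out a Pearson-based collinearity screen is sketched below. It is an assumption-laden illustration, not the authors' procedure: the 0.7 threshold, the greedy keep-first rule, and the synthetic data are all hypothetical, and the PCA step is omitted.

```python
import numpy as np

def drop_collinear(X, names, threshold=0.7):
    """Greedy screen: keep a variable only if |r| with every kept variable <= threshold."""
    r = np.corrcoef(X, rowvar=False)           # pairwise Pearson correlations
    kept = []
    for j in range(X.shape[1]):
        if all(abs(r[j, k]) <= threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]

# Synthetic stand-ins for transformed island variables (not real survey data).
rng = np.random.default_rng(0)
area = rng.normal(size=200)
perimeter = 0.95 * area + 0.1 * rng.normal(size=200)   # nearly redundant with area
forest = rng.normal(size=200)                          # independent of the others

X = np.column_stack([area, perimeter, forest])
kept_names = drop_collinear(X, ["area", "perimeter", "forest"])
print(kept_names)
```

Because the rule keeps whichever variable comes first, the column order encodes which member of a correlated pair is preferred; in practice that choice would rest on biological interpretability.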

Because we wanted to develop parsimonious models
that could be easily interpreted, we considered only
main effects in the model. Beginning with a model that
included all explanatory variables remaining following
exploratory analyses, we manually removed and
added variables to achieve a best-fit model, similar to
an automated stepwise regression procedure.
Goodness-of-fit of each logistic model was reflected by both
the percentage of correctly classified sites and
McFadden's rho-squared value (ρ²). Since the primary
objective of these models was to predict the
probability of occupancy of an island, the percentage
of correctly classified sites, or the correct classification
rate (CCR), represents the most meaningful measure
of model quality (Ryan 1997). However, when comparing
models, it also is useful to use McFadden's ρ²
values as a supplementary goodness-of-fit statistic
(Hosmer and Lemeshow 1989). The ρ² value ranges
from 0 to 1, with higher values reflecting better
model fit, similar to the R² of a linear regression;
however, ρ² values tend to be much lower than R²,
with values over 0.20 considered to represent
satisfactory models.
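For readers unfamiliar with the two fit statistics, the sketch below computes them from observed occupancy (0/1) and fitted probabilities. McFadden's ρ² is taken as 1 minus the ratio of the fitted model's log-likelihood to that of an intercept-only model. The 0.5 classification cutoff is an assumption (the cutoff used in the analysis is not stated), and the six observations are invented.

```python
import math

def log_likelihood(y, p, eps=1e-12):
    """Bernoulli log-likelihood of observed 0/1 outcomes y given probabilities p."""
    return sum(
        yi * math.log(max(pi, eps)) + (1 - yi) * math.log(max(1 - pi, eps))
        for yi, pi in zip(y, p)
    )

def mcfadden_rho2(y, p):
    """1 - LL(model) / LL(null); assumes y contains both occupied and unoccupied sites."""
    p_null = sum(y) / len(y)                      # intercept-only prediction
    ll_null = log_likelihood(y, [p_null] * len(y))
    return 1 - log_likelihood(y, p) / ll_null

def correct_classification_rate(y, p, cutoff=0.5):
    """Percentage of sites whose predicted class matches the observed class."""
    correct = sum((pi >= cutoff) == bool(yi) for yi, pi in zip(y, p))
    return 100 * correct / len(y)

# Invented example: three occupied and three unoccupied islands.
observed = [1, 1, 1, 0, 0, 0]
fitted = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]

rho2 = mcfadden_rho2(observed, fitted)
ccr = correct_classification_rate(observed, fitted)
print(round(rho2, 3), round(ccr, 1))
```

With balanced samples of occupied and unoccupied islands, as used here, a model with no predictive value would be expected to classify about half the sites correctly, which gives the CCR a natural baseline.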

After assessing the fit of each model for the time pe- 
riod for which it was developed, we tested that model 
against survey data from a later time period. For ex- 
ample, the model developed for cormorants based on 
survey data from 1944 was used to predict habitat oc- 
cupancy during 1977 and 1997. The percentage of 
correctly classified sites was determined for each time 
period and the predictive ability of the model across 
different time periods was assessed. 
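A minimal sketch of this kind of temporal test follows, assuming a 0.5 probability cutoff (the cutoff is not stated in the chapter). The AQUA2KM coefficient is the 1944 cormorant estimate reported in Table 13.2; the intercept and the island values are invented for illustration, since neither is reported here.

```python
import math

def predict_occupancy(intercept, coefs, row):
    """Logistic model: probability that an island is occupied."""
    z = intercept + sum(coefs[name] * row[name] for name in coefs)
    return 1 / (1 + math.exp(-z))

def classification_rates(observed, probs, cutoff=0.5):
    """Percent correctly classified for occupied, unoccupied, and all sites."""
    occ_hits = [p >= cutoff for o, p in zip(observed, probs) if o == 1]
    unocc_hits = [p < cutoff for o, p in zip(observed, probs) if o == 0]
    occupied = 100 * sum(occ_hits) / len(occ_hits)
    unoccupied = 100 * sum(unocc_hits) / len(unocc_hits)
    overall = 100 * (sum(occ_hits) + sum(unocc_hits)) / len(observed)
    # Omission error: occupied islands that the model classified as unoccupied.
    return {"occupied": occupied, "unoccupied": unoccupied,
            "overall": overall, "omission": 100 - occupied}

# Model "developed" on the earlier survey (intercept is hypothetical).
intercept = 2.0
coefs = {"aqua2km": -0.258}

# Hypothetical later survey: (transformed AQUA2KM value, observed occupancy).
later_survey = [(2.0, 1), (4.0, 1), (12.0, 1), (15.0, 0), (10.0, 0), (5.0, 0)]

probs = [predict_occupancy(intercept, coefs, {"aqua2km": x}) for x, _ in later_survey]
observed = [o for _, o in later_survey]
rates = classification_rates(observed, probs)
print({k: round(v, 1) for k, v in rates.items()})
```

Reporting occupied and unoccupied islands separately matters because an acceptable overall rate can hide a high omission error.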

TABLE 13.1.

Explanatory variables considered in developing habitat models for double-crested cormorants (Phalacrocorax
auritus), common eiders (Somateria mollissima), and great black-backed gulls (Larus marinus).

Variable    Typeᵃ              Description
Area        C (log)            Island size (m²)
Perimeter   C (log-base 10)    Island perimeter
Maxel       C (log(log))       Maximum elevation (m)
Ledge       C (log)            Coverage (m²) of ledge or boulders
Cobble      C (log)            Coverage (m²) of cobble or gravel
Rock        C (log)            Sum of Ledge and Cobble
Sand        C (log)            Coverage (m²) of sand
Lowcov      C (log)            Sum of Herb, Shrub, and Open coverages
Forest      C (log)            Coverage (m²) of forest or woodland areas
Disturb     C (log)            Coverage (m²) of residential or agricultural areas
House_0     B                  Absence of house, on or contiguous to island
Land_0      B                  Absence of suitable boat landing areas
Mainmin     C (square root)    Distance to the mainland
Roadmin     C (cube root)      Distance to islands with maintained roads
Bigidist    C (cube root)      Distance to the nearest island greater than 50 ha in size
Minlf       C (cube root)      Distance to the nearest town landfill
Boatmin     C (cube root)      Distance to the nearest boat launch
Total2      C (none)           Total area of aquatic habitat within 2 km of island
Lev4km2     C (square root)    Total area of water 21-50 m in depth within 2 km of island
Total4      C (none)           Total area of aquatic habitat within 4 km of island
Lev2km4     C (none)           Total area of water 11-20 m in depth within 4 km of island
Lev5km4     C (log)            Total area of water 51-100 m in depth within 4 km of island
Area8km     C (cube root)      Total area of water less than 10 m in depth within 8 km of island
Aqua2km     C (square root)    Amount of marine aquatic bed habitat within 2 km of island
Estu4km     C (log)            Amount of estuarine habitat within 4 km of island
Estu8km     C (log)            Amount of estuarine habitat within 8 km of island
Anadmin     C (log)            Distance to the nearest anadromous
Salmmin     C (log)            Distance to the nearest Atlantic salmon stream

ᵃType refers to continuous (C), rank (R), or binary (B) variables, with transformations used for analysis noted in parentheses.


Colony size models were developed for cormorants
using multiple linear regression (Systat 7.0; SPSS). In
addition to explanatory variables, nest count data
were transformed to approximate normal distributions.
A manual stepwise regression process also was
used for these models. The models were assessed using
the adjusted R² values and explanatory variables using
their individual P values from the overall model.
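The adjusted R² penalizes R² for the number of explanatory variables k relative to the number of islands n: adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below fits an ordinary least-squares model and reports both statistics. The data are synthetic and the variable roles are hypothetical; it stands in for, and does not reproduce, the Systat analysis.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns coefficients, R2, adjusted R2."""
    n, k = X.shape
    X1 = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return beta, r2, adj_r2

# Synthetic example: transformed nest counts against two transformed predictors.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 2))                       # e.g., stand-ins for LOWCOV, LEV2KM4
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=30)

beta, r2, adj_r2 = fit_ols(X, y)
print(round(r2, 3), round(adj_r2, 3))
```

With small samples, such as the 11 to 13 islands in some colony-age groups, the gap between R² and adjusted R² can be substantial, which is one reason to report the adjusted value.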


Results 


Data for a total of sixty-four explanatory variables
were assembled for the islands used in this study. As
a result of the exploratory data analyses, the number
of explanatory variables was reduced to between
nineteen and twenty-two for inclusion in development
of each species' habitat models; selection of
final variables was based on a review of Pearson's
correlation coefficients between individual species'
survey data and explanatory variables. Table 13.1
lists all variables used in the development of at
least one species model.


Double-crested Cormorant 


Predictive models were developed for the occupancy
of islands by double-crested cormorants during three
survey periods: 1944 (1942-1944), 1977 (1976-1977),
and 1997 (1994-1997). In the earliest period examined
(DCCO44), occupancy was negatively associated
with the extent of aquatic bed habitat within 2
kilometers of an island (AQUA2KM); this was the only
variable included in the model. Eighty-three percent of
the islands used in model development were correctly
classified; however, the ρ² value (0.33) indicated only
a satisfactory model fit (Table 13.2).


TABLE 13.2.

Results of habitat occupancy models for double-crested cormorants (Phalacrocorax auritus), common eiders
(Somateria mollissima), and great black-backed gulls (Larus marinus) over discrete time periods.

                                  Overall model          Explanatory variablesᶜ
Species                    Year   N     ρ²ᵃ     CCRᵇ     Variable   Estimate   P
Double-crested cormorant   1944   30    0.328   83.3     aqua2km    -0.258     .007
                           1977   102   0.627   84.3     forest     -1.836     .955
                                                         aqua2km    -0.172     .024
                                                         lowcov      0.215     .039
                                                         total2km    0.003     .020
                           1997   106   0.635   89.6     lowcov      0.204     .048
                                                         forest     -1.847     .965
                                                         house_0     1.973     .964
                                                         maxel       4.125     .023
                                                         aqua2km    -0.225     .002
Common eider               1943   33    0.490   84.8     lowcov      0.5385    [illegible]
                                                         aqua2km    -0.163     .055
                           1977   158   0.582   86.1     lowcov      0.504     <.001
                                                         bigisle     0.646     <.001
                                                         house_0     1.449     .001
                           1984   159   0.651   86.2     lowcov      0.439     <.001
                                                         house_0     2.566     <.001
                                                         total2km    0.003     .001
                                                         aqua2km    -0.140     .001
Great black-backed gull    1944   38    0.796   94.7     lowcov      0.868     .014
                                                         aqua2km     0.286     .065
                           1977   128   0.703   90.6     lowcov      0.549     <.001
                                                         forest     -0.421     <.001
                                                         bigisle     0.686     .001
                                                         aqua2km    -0.151     .005
                           1995   134   0.664   92.5     lowcov      0.540     <.001
                                                         forest     -0.478     <.001
                                                         total2km    0.002     .004

ᵃMcFadden's rho-squared (ρ²) is an estimate of the fit of the overall model.
ᵇCCR represents the percentage of correctly classified sites.
ᶜExplanatory variables are presented along with their coefficients.

The 1977 cormorant model (n = 102) showed a
positive relationship between site occupancy and both
the extent of herbaceous and low shrub cover (LOWCOV)
and the total amount of water in the immediate
vicinity (2 kilometers) of an island (TOTAL2KM). A
negative relationship was apparent with both forest
cover (FOREST) and AQUA2KM. This model was
relatively robust, with a ρ² value of 0.63 and CCR of
84 percent. The 1997 model again indicated that the
presence of nesting cormorants was positively associated
with the amount of low cover and negatively associated
with FOREST and AQUA2KM. However,
the remaining explanatory variables differed, including
maximum elevation (MAXEL) and the absence of houses
on the island or contiguous islands (HOUSE_0).


TABLE 13.3.

Comparison of the percentage of correctly classified sites (CCR) determined for island occupancy by colonies of double-crested
cormorants (Phalacrocorax auritus), common eiders (Somateria mollissima), and great black-backed gulls (Larus marinus) during
different survey periods.

                                                            Correctly classified sites (%)
Species                    Model yearᵃ   Test yearᵇ   N     Occupied   Unoccupied   Overall
Double-crested cormorant   1944          1944         30    71.4       93.7         83.3
                                         1977         102   56.9       82.4         69.6
                                         1997         106   53.8       91.7         72.6
                           1977          1977         102   78.4       90.2         84.3
                                         1997         106   80.8       88.9         84.9
                           1997          1997         106   92.3       87.0         89.6
Common eider               1943          1943         33    93.8       76.5         84.8
                                         1977         158   79.7       84.8         82.2
                                         1984         159   80.8       92.6         86.8
                           1977          1977         158   91.1       81.0         86.1
                                         1984         159   91.0       77.8         84.3
                           1984          1984         159   89.7       81.5         86.2
Great black-backed gull    1944          1944         38    94.7       94.7         94.7
                                         1977         128   81.5       82.5         82.0
                                         1995         134   80.6       80.6         80.6
                           1977          1977         128   95.4       85.7         90.6
                                         1995         134   94.0       76.1         85.1
                           1995          1995         134   91.0       94.0         92.5

ᵃModel year refers to the survey data used to develop that model.
ᵇTest year refers to the survey data used to test the model.

The presence of aquatic bed vegetation within 2
kilometers of an island was the most significant variable
affecting nesting cormorant use of islands during all time
periods examined. Despite this consistency, however,
the ability of the 1944 model to correctly predict site
occupancy in 1977 and 1997 was 10-14 percent lower
than that for 1944 (Table 13.3), and the omission error
rate was more than 40 percent for both periods. In
addition to AQUA2KM, the extent of low cover and
forest were important variables in both the 1977 and 1997
models. Though these models were not identical, the
1977 model was able to correctly classify occupancy in
84-85 percent of sites in both 1977 and 1997 (Table
13.3). In fact, the omission error rate for the 1977
model (i.e., the percentage of occupied islands classified
as unoccupied) was lower when tested against 1997
data (Table 13.3).


Colony Size Analysis 


Models based on full surveys (i.e., all
colonies surveyed in a given year) indicated little
temporal change in the relationship between the number
of nesting pairs on an island and the explanatory
variables over the different survey periods (Table 13.4).
The 1944 and 1977 models both included LOWCOV and
LEV2KM4 (the extent of coastal water 10-20 meters deep
within 4 kilometers of an island), while the 1997
model was limited to LOWCOV. Although the first
model explained a significant amount of variability
in nest counts (R² = 0.66), neither the 1977 nor the 1997
model was particularly good (R² = 0.21 and 0.22,
respectively). Since each model includes data for all
islands with cormorants during that survey period,
however, it is difficult to distinguish potential
differences between islands with long-established colonies
and recently colonized islands.


TABLE 13.4.

Comparison of the colony size models developed for double-crested
cormorants (Phalacrocorax auritus) using survey data
from different time periods.

Survey year   N    R²     Explanatory variablesᵃ   P
1944          13   .662   lev2km4                  < .005
                          lowcov                   < .01
1977          51   .207   lev2km4                  < .01
                          lowcov                   < .05
1997          50   .216   lowcov                   < .05

ᵃVariables defined in Table 13.1.


To elucidate the differences between older and more
recently established colonies, and the changes that take
place within a colony over time, we divided the data
into three distinct island groups, based on the age of the
colony, for further analysis. Specifically, DCCO44 was
composed of those islands colonized by cormorants
prior to 1945; DCCO77 included islands colonized
after 1944 but prior to 1978; and DCCO95 was limited
to those islands first colonized after 1977. In each case,
the size of recently established colonies appears to be
most strongly influenced by the aquatic (foraging)
environment, specifically LEV2KM4 (Table 13.5). The size
of established colonies (twenty years or more), however,
is more closely tied to island size or the amount of low
cover on an island. In the case of colonies established
prior to 1945, island size alone was able to explain 54
percent and 33 percent of the variability in colony sizes
for 1977 and 1997, respectively, but only 2 percent of
the variability in the size of colonies during 1944.


TABLE 13.5.

Comparison of the colony size models developed for double-crested cormorants (Phalacrocorax
auritus) using survey data from different time periods.

Year of colonizationᵃ   Survey year   N    R²     Explanatory variablesᵇ   P
Prior to 1945           1944          13   .662   lev2km4                  < .005
                                                  lowcov                   < .01
                        1977          13   .764   area                     < .001
                                                  lev5km4                  < .005
                        1997          12   .636   area                     < .05
                                                  lev2km4                  < .05
1945-1977               1977          38   .302   lev2km4                  < .001
                                                  roadmin                  < .05
                        1997          38   .431   lowcov                   < .001
1978-1997               1997          11   .860   roadmin                  < .001
                                                  total2km                 < .05
                                                  estu4km                  NS

ᵃYear of colonization refers to the first survey, conducted after 1900, for which a cormorant colony was
noted on an island.
ᵇVariables defined in Table 13.1.


Common Eider


Two variables were included in the 1943 model for
common eiders. The occurrence of nesting eiders was
positively related to the amount of low cover and
negatively associated with AQUA2KM (Table 13.2). The
overall model fit (ρ² = 0.49) was satisfactory and 84.8
percent of sites were correctly classified. The presence
of eiders in 1977 was positively correlated with
increasing distance from islands more than 50 hectares in size
(BIGISLE), the amount of low cover, and the absence of
houses on or adjacent to an island (HOUSE_0). The
overall fit for this model (ρ² = 0.58) was fairly good and
was supported by a CCR of 86 percent of the 158 sites
included. The CCR for the 1984 model also was 86
percent, with a ρ² of 0.65. As in the 1977 model, both
LOWCOV and HOUSE_0 were positively associated
with the presence of eiders. However, distance to large
islands (BIGISLE) was not included in this model, being
replaced by a negative association with AQUA2KM and a
positive association with TOTAL2KM.

Variables included in all three eider models were
similar, with LOWCOV the most important predictor
in each. In addition, a negative association with
AQUA2KM and a negative association with the presence
of houses on or adjacent to an island each occurred in
two of the three models. The predictive ability of these
models also differed little across the time periods
examined. The model developed from 1943 data correctly
classified between 82 and 87 percent of sites for all
three time periods assessed (Table 13.3).


Great Black-backed Gull 


Like the earliest eider model, the 1944 model for
black-backed gulls included two explanatory variables,
LOWCOV and AQUA2KM (Table 13.2). However,
this simple model had a very high ρ² value (0.80)
and correctly classified 95 percent of sites. The 1977
model also included LOWCOV and AQUA2KM as
variables but was more complex, including two other
explanatory variables. Island occupancy for this time
period was positively associated with increasing
distance from the mainland or islands larger than 50
hectares and negatively associated with the amount of
forest cover. This model also had a high ρ² value
(0.70) and a CCR of 91 percent. The 1995 model
again included LOWCOV and FOREST; however, the
third explanatory variable was a positive association
with the total amount of coastal water within 2
kilometers. This model had a ρ² of 0.66 and a correct
classification rate of 92.5 percent.

As with the previous two species, a single variable, 
LOWCOV in this case, exerted the most influence on 
models across all three time periods. The ability of 
early models to predict future occupancy was rela- 
tively good (80-85 percent), but did exhibit a notice- 
able decrease across time (13-14 percent from 1944 to 
1977 or 1997; Table 13.3). 


Discussion 


Island Occupancy 


The extent of low cover on an island was the most
significant explanatory variable for all eider and gull
models and also was significant in the latter two models
for cormorants. Along the Maine coast, both eiders
and black-backed gulls use dense herbaceous or shrub 
cover for nesting sites (Blumton et al. 1988; C. John- 
son personal observation). However, cormorants pre- 
fer to nest on bare ground or in trees (Mendall 1936; 
Cowger 1976; Nettleship and Birkhead 1985). Since 
cormorant nests usually are not located in herbaceous 
or shrub vegetation, the importance of low cover may 
be associated with the protection it affords young 
from predators, especially gulls, when adults are ab- 
sent from the colony. Eiders nesting in dense cover 
have been shown to have lower predation and higher 
success rates than those nesting in less-concealed areas 
(Choate 1967). In addition, islands with dense herba- 
ceous or shrub cover may provide nesting birds with 
more protection from winds and other harsh environ- 
mental conditions than those with little or no vegeta- 
tive cover. 

A negative association with extensive areas of 
aquatic bed vegetation in the immediate vicinity of an 
island (i.e., within 2 kilometers) also was an important 
variable in the earliest model for each species. Previ- 
ous analyses (Johnson 1998) indicated that the pres- 
ence of aquatic bed habitat could be acting as a surro- 
gate for intertidal areas in the model. As such, this 
negative relationship may be associated with avoid- 
ance of predators. Larger islands are more likely to 
support populations of predatory mammals, and in- 
tertidal (i.e., low tide) connections to these islands 
could allow predators to move more easily onto 
smaller islands. 

The earliest models, when fewer than forty islands 
in the study area were occupied by each species, were 
quite simple, limited to one or both of the abovemen- 
tioned variables, AQUA2KM and LOWCOV. These 
variables appear to be related to two of the three pri- 
mary requirements for colony establishment: the pres- 
ence of suitable breeding sites and safety from terres- 
trial predators. 

The ability of the 1943 eider model to predict
occupancy forty years in the future was quite good (87
percent). However, the predictive abilities of the
earliest cormorant and gull models were less satisfactory,
especially in the case of cormorants. The omission
error rate for the 1944 cormorant model was
between 40 and 50 percent when tested on data
from 1977 and 1997. The difference in predictive 
ability of the eider model versus the cormorant and 
gull models could be related to the ability of these 
species to exploit varied habitats and food resources. 
The more stenotopic eider nests almost exclusively in
dense, low herbaceous and shrub cover and relies on 
motile or sessile food resources, such as urchins, 
mussels, and snails (Choate 1967; Cantin et al. 
1974). Cormorants and gulls, on the other hand, are 
able to adapt to different nesting conditions as well 
as to changing food-resource levels and locations. 
Cormorants are opportunistic feeders that may read- 
ily switch between prey when a particular food item 
becomes scarce (Mendall 1936; Ross 1974; Black- 
well et al. 1995). The diet of great black-backed gulls 
is extremely flexible, including fish and shellfish, 
seabirds and their eggs, other vertebrates, insects, 
and refuse (Buckley 1990). Thus, as seabird popula- 
tions expanded and available resource levels shifted 
along the Maine coast from the early 1900s to the 
1970s and 1990s, it is likely that cormorants and 
gulls were better able to exploit a wider variety of 
habitats and food resources. 

Populations of each species increased greatly from 
the mid-1940s to 1977 (Korschgen 1979; Johnson and 
Krohn 1998). The 1977 models all had a correspon- 
ding increase in complexity, incorporating additional 
variables for a best-fit model. Although there was little 
change in population size or the number of occupied 
islands in any of the species considered between 1977 
and 1995, the explanatory variables included in the 
models did change over this time period. These 
changes could simply reflect a difference in the random
selection of unoccupied islands (since each model
included only about half the available pool of islands).
However, a review of the islands occupied in 1977 and 
the 1990s indicates a change in occupied islands as 
well, at least for cormorants and gulls. Although most 
seabirds exhibit strong site fidelity (Cairns 1992), it is 
not uncommon for entire colonies of some species to 
shift breeding locations (islands) between years (Mendall
1936; Buckley and Downer 1992). In the case of
double-crested cormorants, only 60 percent of the islands
in the study area used for nesting at some time
from 1994 to 1997 were occupied during all four
years (Johnson 1998).

Despite the change in model composition, however, 
the ability of 1977 models to predict island occupancy 
seven to twenty years in the future was good for all 
species (84-85 percent; Table 13.3). The robustness of
the 1977 models in predicting future island occupancy 
could be indicative of habitat saturation by all three 
species in the area. Since an increasing number of 
habitats are used as populations increase (Svardson 
1949; O'Connor 1987b; Rosenzweig 1991), species at 
saturated densities could be expected to be occupying 
most usable habitat types in an area. Thus, if an island 
occupancy model is developed using data from a pop- 
ulation at a very high density, it should, theoretically, 
be able to identify all potentially usable habitat. Popu- 
lations of both double-crested cormorants and com- 
mon eiders were considered to be near saturated den- 
sities for the region in the late 1970s and early 1980s 
(Mendall 1976; Krohn et al. 1992; Krohn et al. 1995; 
Johnson and Krohn 1998). 

We did not explicitly test for spatial autocorrelation 
in our logistic models because of the clumped spatial 
pattern of habitat (i.e., islands) in the coastal system 
studied. Although several methods are available to es- 
timate the degree of spatial autocorrelation that often 
is exhibited by ecological data (Sokal and Oden 1978; 
Cliff and Ord 1981; Legendre 1993), most of these 
methods (e.g., Moran's I, Mantel test, semivariogram) 
were designed primarily for data collected from a reg- 
ular lattice or measured on a continuous scale (Cressie 
1993; Legendre and Legendre 1998). Little attention 
has been given to developing similar techniques for bi- 
nary data or data collected on an irregular lattice, 
such as that presented in this chapter (Cressie 1993; 
but see Fingleton 1983). At this time, we are not 
aware of any software package that includes methods 
for directly estimating the effect of spatial autocorrela- 
tion for inclusion in our logistic regression models 
(but see Smith 1994 and Klute et al., Chapter 27). 
Since we did not explicitly account for potential spa- 
tial dependence in our model, the results of our signif- 
icance test could be somewhat biased (i.e., the Type I 
error rate could be greater than the declared alpha 
level); however, such effects are unlikely to change the 
direction of the observed relationships. 
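For orientation only, the sketch below shows how one of the statistics named above, Moran's I, could be computed for values attached to irregularly spaced sites (for example, model residuals at island centroids). This was not done in the present study, and the caveats in the preceding paragraph about binary data and irregular lattices still apply. The inverse-distance weighting scheme, the coordinates, and the values are all hypothetical choices made for the example.

```python
import numpy as np

def morans_i(coords, values):
    """Moran's I with inverse-distance weights: (n / S0) * (z' W z) / (z' z)."""
    coords = np.asarray(coords, dtype=float)
    x = np.asarray(values, dtype=float)
    # Pairwise Euclidean distances between sites.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    w = np.zeros_like(d)
    nonzero = d > 0                      # diagonal (and coincident sites) get weight 0
    w[nonzero] = 1.0 / d[nonzero]
    z = x - x.mean()
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)

# Six hypothetical sites along a line, with clustered and alternating values.
sites = [(i, 0.0) for i in range(6)]
clustered = [1, 1, 1, 0, 0, 0]
alternating = [1, 0, 1, 0, 1, 0]

i_clustered = morans_i(sites, clustered)
i_alternating = morans_i(sites, alternating)
print(round(i_clustered, 3), round(i_alternating, 3))
```

Values above the expectation under no autocorrelation, -1/(n - 1), indicate that nearby sites tend to be similar; a significance test would normally be based on permutation of the values among sites.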


Colony Size 


As mentioned previously, there was little difference in 
temporal habitat associations between the full survey 
colony size models for double-crested cormorants. 
However, interesting changes in habitat relationships 
emerged when colonies of different ages (i.e., islands 
that were initially colonized at different times since 
1900) were analyzed separately. In a crude sense, this 
allowed us to follow individual sites or colonies 
through time. As the colony aged, there was a shift in 
emphasis from aquatic and disturbance-related factors 
to physical island characteristics for explaining most 
of the variability in colony size, though aquatic vari- 
ables were still important. This distinction was lost 
when looking at models developed for all colonies 
surveyed during a given period (i.e., the full-survey 
models). 

The size of recently established colonies can be mis- 
leading and models developed during that time period 
may not be reflective of factors that will ultimately 
regulate colony size. Although many of the islands 
along the Maine coast exhibited a rapid increase in the 
number of nesting cormorants after initial colonization,
over a long period (e.g., twenty to fifty years),
colony size generally decreased to a relatively consis- 
tent, much lower level (Johnson 1998). Krohn et al. 
(1992) noted a similar overall decline in the number of 
nesting pairs of eiders per island and suggested that a 
steady decline could be indicative of a stabilizing
population. The importance of considering colony age in
interpreting models based on colony size further
supports caution against using density alone as an
indicator of habitat quality (Van Horne 1983;
Vickery et al. 1992).


Management Recommendations 


The results of this study indicate that seabird occu- 
pancy models may be good predictors of future island 
colonization if species either are habitat specialists or 
are at or near saturation density in a region. However,
caution should be exercised when using the results of
models developed for less-specialized species to predict
future island use, especially when those species are at
low densities in a region. The use of colony size as an indicator of
habitat quality also should be considered cautiously, 
since the size of a recently established colony often is 
misleading. In addition, the location of occupied versus
unoccupied, and high- versus low-quality, sites
may shift over long time periods as the result of broad
habitat changes caused by environmental or
human-induced factors (Helle and Jarvinen 1986;
Greco et al., Chapter 14). Interspecific interactions
(e.g., predation) also can result in displacement of 
seabird colonies from formerly high-quality sites to 
new areas (Buckley and Buckley 1984; Moors and 
Atkinson 1984). Coastal environments are especially 
prone to such fluctuations as a result of both environmental
stochasticity and anthropogenic influences.


Geographic Modeling of Temporal Variability
in Habitat Quality of the Yellow-billed
Cuckoo on the Sacramento River,
Miles 196-219, California


Steven E. Greco, Richard E. Plant, and Reginald H. Barrett 


The western subspecies of the yellow-billed cuckoo
(Coccyzus americanus occidentalis) is a riparian-obligate
forest-interior species associated with large
blocks (more than 41 hectares) of riparian forest vege- 
tation (Laymon and Halterman 1987a; Laymon and 
Halterman 1989; Laymon et al. 1997) within the 
major alluvial meandering system of the Sacramento 
River. Because of the precipitous decline in the yellow- 
billed cuckoo’s population and the scarcity of riparian 
habitat resources remaining in California (Bay Insti- 
tute 1998), the yellow-billed cuckoo is presently a 
state-listed endangered species (Steinhart 1990). 

The portion of the riparian ecosystem occupied by 
the yellow-billed cuckoo is a highly dynamic zone of 
the floodplain where forest structure and floristic com- 
position can change significantly in less than a decade. 
The changes result in a shifting mosaic controlled by 
three major agents on the Sacramento River: (1) natu- 
ral river channel migration and floodplain dynamics, 
such as erosion, deposition, scour, and flooding (phys- 
ical processes); (2) succession and high growth rates of 
riparian vegetation (biological processes); and (3) cul- 
tural modification of the landscape for flood control 
and agricultural land development (e.g., regulated 
flows from dams, levee construction, and land clearing 
for crops). 

Based on these three agents of change, the main
hypothesis for this investigation was: As a consequence
of the extensive, rapid changes to the structure of the
riparian landscape, the suitability or quality of habitat
with respect to its potential value for nesting and
foraging for the yellow-billed cuckoo can change rapidly
through time, forming a shifting mosaic of habitat
patches that vary in area. The hypothesis was tested
by modeling the habitat relationships of the yellow- 
billed cuckoo within a 23-river-mile study reach on 
the Sacramento River using spatial data from 1997 
and then applying the habitat model to five historical 
land-cover data sets. The historical land-cover data, 
spanning the years 1938 to 1987, were used for compar- 
ative purposes to monitor the probable historical de- 
velopment (formation and extinction) of suitable habi- 
tat patches for the yellow-billed cuckoo through time. 
Knowledge of habitat patch dynamics can help man- 
agers develop more effective recovery plans for the 
yellow-billed cuckoo on the Sacramento River. 


Background 


The western subspecies of the yellow-billed cuckoo is 
a Neotropical migrant that winters in South and Cen- 
tral America and breeds in western North American 
riparian forests and floodplains during the summer 
months (Gaines 1970; Halterman 1991; Ehrlich et al. 
1992). The yellow-billed cuckoo once bred abun- 
dantly throughout the western Pacific states, but today 
the northern range is limited to the middle reaches of 
the Sacramento River (Laymon and Halterman 1987a; 
Zeiner et al. 1990). The decline in abundance of the 
yellow-billed cuckoo is mainly attributed to conver- 
sion of essential riparian floodplain habitats to agri- 
cultural and flood control land uses and also may have 
been augmented by the widespread application of pes- 
ticides (Gaines and Laymon 1984; Laymon and Hal- 
terman 1987a; Steinhart 1990). Additionally, the mi- 
gration pathways of the cuckoo have been fragmented 
over the past 150 years of settlement. 

Historically, the yellow-billed cuckoo was de- 
scribed as common in the Central Valley of California 
(Belding 1890); however, by the 1940s the popula- 
tion’s decline was noted by Grinnell and Miller 
(1944). Population censuses conducted intermittently 
between 1972 and 1990 indicated that the Sacramento 
River population of the yellow-billed cuckoo declined 
from an estimated 96 pairs in 1973 (Gaines 1974) to 
approximately 60 pairs in 1977 (Gaines and Laymon 
1984); a four-year survey by Halterman (1991) found 
that the number of pairs varied between twenty-three 
and thirty-five from 1987 to 1990. The yellow-billed 
cuckoo was listed 
by the state of California as threatened in 1971 and 
then as endangered in 1988 (California Department 
of Fish and Game 1998). For a description of the sub- 
species status of the yellow-billed cuckoo, see Ridg- 
way (1887); Bent (1940); Laymon and Halterman 
(1987a); Banks (1988); Franzreb and Laymon (1993). 
The American Ornithologists’ Union listed the west- 
ern subspecies in its checklist from 1895 to 1957 
(AOU 1895, 1910, 1931, 1957). 

Also known as the California cuckoo or the west- 
ern yellow-billed cuckoo (Ridgway 1887; Martin et 
al. 1951; Ehrlich et al. 1992), the bird is medium- 
sized, averaging 27-30 centimeters in length and 60 
grams in weight. The yellow-billed cuckoo arrives in 
California from South America between late June 
and early July and departs California with its young 
by mid-August to September. Past field studies of the 
yellow-billed cuckoo’s nesting and foraging habitat 
requirements (Gaines 1970; Gaines 1974; Laymon 
and Halterman 1987a; Halterman 1991; Laymon et 
al. 1997) show it is a riparian-obligate forest-interior 
species that has a home range between 17 and 41 
hectares (about 40-100 acres) (Laymon 1980; Lay- 
mon and Halterman 1987a; Laymon and Halterman 
1989; Laymon et al. 1997). The preferred habitat is 
a mosaic of riparian forest vegetation consisting of 
willow (Salix spp.) and cottonwood (Populus fre- 
montii) forests in combination with open-water habi- 
tats such as an oxbow lake or backwater channel. 
Dense vegetation less than 20 meters in height is es- 
pecially important for nesting while both low and 
high vegetation are used for foraging (Laymon et al. 
1997). Preferred prey include green caterpillars, 
hornworms, katydids, tree frogs, grasshoppers, and 
cicadas. The average clutch is two to four eggs and 
the young are fledged at an age of one month. In 
years of high food abundance, double and triple 
brooding has been observed (Halterman 1991; Lay- 
mon et al. 1997). 


Wildlife Habitat Relationship Modeling 


It is widely recognized that models of species-specific 
habitat relationships and population demographics 
can be useful in gaining greater insight into develop- 
ment of a conservation strategy for recovery of a 
species of special management concern (USFWS 
1981a; USFWS 1981b; Verner et al. 1986b; National 
Research Council 1995; Bissonette 1997b; Noss 
et al. 1997). The objective of our study was to de- 
velop a habitat suitability model to predict presence 
or absence of yellow-billed cuckoo. This was done 
using a modified version of the California wildlife 
habitat relationships (CWHR) system land-cover 
classification scheme (Mayer and Laudenslayer 
1988) to model the habitat requirements of the yel- 
low-billed cuckoo. 

It is important to note that the predicted habitats of 
the CWHR system represent only potential habitat, 
since, for example, nonhabitat factors that affect abun- 
dance, such as competition, predation, disease, and 
stochastic processes, are not considered (Airola 1988; 
Garrison 1993). For a discussion of gap analysis and 
application of species-distribution maps (e.g., using 
wildlife habitat relationship models) see Scott et al. 
(1987), Burley (1988), Davis et al. (1990), Lopez 
(1998), and Savitsky et al. (1998). For more informa- 
tion on wildlife habitat relationship modeling, see 
Verner et al. (1986b) and Morrison et al. (1992). 


Materials and Methods 


Study reach description. Throughout this study, spe- 
cific locations and reaches of the river were spatially 
referenced using the river-mile marker system estab- 
lished by the U.S. Army Corps of Engineers (e.g., US- 
ACOE 1991). The study reach for this investigation 
was located between river-mile 196 at Pine Creek 
bend and river-mile 219, near Woodson Bridge State 
Recreation Area (Fig. 14.1a, b). This extent of the 
river was selected because (1) yellow-billed cuckoo 
surveys were available, (2) there were few levees to 
constrain flood flows, and thus geomorphic processes 
such as active channel migration and bend cutoff had 
occurred over the past thirty years resulting in exten- 
sive riparian forests, (3) historical aerial photographs 
were available, and (4) several publicly owned lands 
were accessible for research purposes (Fig. 14.1b). The 
publicly accessible areas include the Pine Creek 
Wildlife Area and Wilson Landing (both owned and 
managed by the California Department of Fish and 
Game) as well as the River Vista Unit at Merrills 
Landing, and Foster Island (owned and managed by 
the U.S. Fish and Wildlife Service). 

The six available surveys for presence of yel- 
low-billed cuckoos in the study reach were dated in- 
termittently from 1972 to 1990 (Gaines 1974; Gaines 
and Laymon 1984; Laymon and Halterman 1987b; 
Halterman 1991). All of these surveys had found the 
Pine Creek Wildlife Area, located at the southern 
end of the study reach, to consistently be one of the 
highest-density breeding areas of the yellow-billed 
cuckoo on the river. 

Many historical aerial photographs of the Sacra- 
mento River were available in stereo-pair format for 
the years spanning 1937 to 1997. The six time periods 
selected and mapped for this study were 1938, 1952, 
1966, 1978, 1987, and 1997. These dates were chosen 
to assess landscape changes at approximately decadal 
intervals (the mean interval between consecutive 
mapping dates was 11.8 years). 
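
The 11.8-year figure can be checked directly from the six mapping dates listed above. The following is a minimal sketch; the variable names are illustrative and not part of the original study.

```python
# Mapping years as listed in the text; names are illustrative only.
mapping_years = [1938, 1952, 1966, 1978, 1987, 1997]

# Interval between each pair of consecutive mapping dates.
intervals = [later - earlier
             for earlier, later in zip(mapping_years, mapping_years[1:])]
mean_interval = sum(intervals) / len(intervals)

print(intervals)      # [14, 14, 12, 9, 10]
print(mean_interval)  # 11.8
```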

Land-cover mapping methods. For this study a 
modified version of the CWHR land-cover classifica- 
tion system was used. Three primary habitat variables 
were interpreted within the extent of the study reach: 
(1) land-cover type, including riparian, valley oak 
woodland, annual grassland, freshwater emergent 
wetland, riverine, lacustrine, gravel bar, orchard, crop- 
land, pasture, and urban/developed; (2) tree size in 
three height categories: low (less than 6 meters), 
medium (6-20 meters), and high (greater than 20 me- 
ters); and (3) canopy cover, in four classes: sparse 
(10-24 percent), open (25-39 percent), moderate 
(40-59 percent), and dense (60-100 percent). For 
comparison (a category crosswalk) of the land-cover 
types used to map riparian vegetation in this study 
and the equivalent CWHR system land-cover types, 
tree size classes and canopy cover classes, see Greco 
(1999). 
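
The height and canopy cover classes described above amount to simple threshold rules. The sketch below shows one way such rules might be written. The function names are hypothetical, the treatment of values falling exactly on a class boundary is an assumption, and the label for cover below 10 percent is taken from the legend of Figure 14.3 rather than from the class list above.

```python
def height_class(height_m):
    """Return the stand height class for a height in meters.

    Thresholds follow the text: low (<6 m), medium (6-20 m), high (>20 m).
    """
    if height_m < 0:
        raise ValueError("height must be non-negative")
    if height_m < 6:
        return "low"
    if height_m <= 20:
        return "medium"
    return "high"


def canopy_class(cover_pct):
    """Return the canopy cover class for a percent cover between 0 and 100.

    Assumption: each class runs up to the lower bound of the next class,
    so a value such as 24.5 percent is treated as sparse.
    """
    if not 0 <= cover_pct <= 100:
        raise ValueError("cover must be between 0 and 100 percent")
    if cover_pct < 10:
        return "none or very sparse"
    if cover_pct < 25:
        return "sparse"
    if cover_pct < 40:
        return "open"
    if cover_pct < 60:
        return "moderate"
    return "dense"


print(height_class(12.7), canopy_class(45))  # medium moderate
```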

The modified mapping system was tested using 
1997 aerial photography within the extent of the 
study boundaries defined by the flight lines as depicted 
in Fig. 14.1b. The land-cover and woody vegetation 
structural classes were delineated and field verified in 
1997 before applying the mapping methods to the his- 
torical photographic sets. Each aerial photographic 
image was manually interpreted under a stereoscope 
by marking geographic control points and stratifying 
the riparian landscape into homogeneous land-cover 
variables delineated as polygons. The delineation 
process employed a grain size, or minimum mapping 
unit, of 20 meters. Forest stand height classes were vi- 
sually estimated for each polygon using the stereo- 
scope and canopy cover was estimated using a stan- 
dard photo interpretation tree stocking density scale 
(Aldrich et al. 1984). 

The interpreted mapping was then converted to 
digital format and imported into the geographic infor- 
mation system software application (GIS), ArcInfo 
Version 7.1 (ESRI 1997) and converted to polygon 
coverages. The polygon coverages were then attrib- 
uted and transformed to geographic coordinates using 
control points derived from corresponding (matched) 
coordinate pair locations from each historical photo- 
graphic set to a 1978 (black and white) orthophoto- 
graphic set of quadrangle maps from the U.S. Geolog- 
ical Survey at a mapping scale of 1:24,000. 

Figure 14.1. (a) Study reach location within the Sacramento Valley, (b) the study reach public lands, river-mile 
markers, approximate locations of yellow-billed cuckoo (Coccyzus americanus) (YBCU) detection surveys, and 
the extent of the analysis boundary common to each temporal data set from river-miles 196-219 (grid tics in 
UTM, Zone 10). 

Validation of land-cover mapping. The 1997 land- 
cover mapping was field verified in late summer 1997 
at four sampling locations that were publicly accessi- 
ble from either land or water. Only the riparian land- 
cover types were selected for field verification; the 
agricultural land-cover types were not verified. A com- 
plete list of polygons was generated and sorted by lo- 
cation, height class, and canopy cover class. Sample 
size was determined for the “stand-height” variable 
using Stein's two-stage method (Steele and Torrie 
1960). From this sample-size assessment, a stratified 
random set of sixteen polygons was selected from the 
polygon list using a random number table. 
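
As a rough illustration of the sample-size step, Stein's two-stage procedure is commonly stated as n = t² s² / d², where s is the standard deviation from a first-stage sample, d is the desired half-width of the confidence interval, and t is the tabulated Student's t value for the first-stage degrees of freedom. The sketch below assumes that form. The function name and the input values are hypothetical and do not reproduce the calculations reported in Table 14.4.

```python
import math


def stein_sample_size(first_stage_sd, half_width, t_value):
    """Sample size from the commonly stated Stein formula n = t^2 s^2 / d^2.

    first_stage_sd: standard deviation from the first-stage sample (s)
    half_width:     desired confidence-interval half-width (d), same units
    t_value:        tabulated t for the first-stage degrees of freedom,
                    supplied by the caller (assumption: looked up from a table)
    """
    if half_width <= 0:
        raise ValueError("half_width must be positive")
    n = (t_value ** 2) * (first_stage_sd ** 2) / (half_width ** 2)
    return math.ceil(n)  # round up to a whole number of polygons


# Hypothetical inputs: s = 2.0 m, d = 1.5 m, t = 2.132 (two-sided 0.10, 4 df).
print(stein_sample_size(2.0, 1.5, 2.132))  # 9
```

In practice the result would be compared with the number of units already measured in the first stage, and additional units sampled only if n exceeds that number.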

The height of each sample forest-stand polygon was 
measured from an observation point using a survey 
laser instrument (Criterion model 400; Laser Technol- 
ogy 1996) at a mean distance of approximately 50 
meters. At each forest stand, twenty measurements 
were collected from restricted random compass bear- 
ings using a random number table. The forest stand 
canopy cover was measured using a vertical densito- 
meter (sighting tube) (Geographic Resource Solutions 
1997), by taking one hundred measurements over a 
randomly positioned 100-meter transect with perpen- 
dicular cross lengths each 20 meters long (Ganey and 
Block 1994). See Greco (1999) for more information 
regarding the field methods and results. 

Based on positive results from field verification of 
the 1997 land-cover mapping of the study reach, the 
same mapping process was then applied to the histori- 
cal photographs from 1938, 1952, 1966, 1978, and 
1987. An error check was performed on each mapping 
year's final land-cover data set to verify database 
attribute accuracy and to detect polygon boundary 
errors and omissions. 


Habitat Modeling of the Yellow-billed Cuckoo 


The geographic habitat model developed in this inves- 
tigation to assess the quality of yellow-billed cuckoo 
habitat on the Sacramento River was implemented 
using ArcView GIS version 3.1 (ESRI 1996a), Spatial 
Analyst extension version 2.0 (ESRI 1996b), and the 
native scripting language Avenue (ESRI 1996c). For a 
description of the GIS procedures and Avenue code 
developed to implement the model, see Greco (1999). 

The purpose of the model was to identify potential 
suitable or optimal habitat patches within the extent 
of the study reach (river-miles 196 to 219) by assess- 
ing attributes of the land-cover data based on the 
known nesting, feeding, and cover habitat require- 
ments of the yellow-billed cuckoo. The model gener- 
ates a habitat index score (scaled from 0 to 1) for a 
particular patch that reflects the relative strength or 
magnitude of the suitability of the patch to support a 
breeding pair of yellow-billed cuckoos. The model 
does not predict carrying capacity (population density 
per unit area). A “patch” in this investigation was de- 
fined as the geographic area of the following contigu- 
ous land-cover categories: riparian, freshwater emer- 
gent wetland, or lacustrine. A patch is also referred 
to as a forest stand, polygon, zone, or “zonal area” 
in the following descriptions involving spatial data 
processing. 

HSI model parameters. A habitat suitability index 
(HSI) model for the yellow-billed cuckoo was devel- 
oped from a wildlife habitat relationship model pub- 
lished by Laymon and Halterman (1989) in combina- 
tion with an additional forest structure variable 
derived from Laymon (1980) and Laymon et al. 
(1997) (Table 14.1). The HSI model variables used for 
this study were (1) patch area, (2) patch width, (3) 
patch distance-to-water, (4) within-patch ratio of high 
vegetation to low and medium vegetation, and (5) 
within-patch ratio of young floodplain to old flood- 
plain. Two versions of the HSI model were tested: a 
“reduced model” that used variables 1-3 (listed previ- 
ously) and a “full model” that used all five variables. 
Each of the yellow-billed cuckoo HSI variables is de- 
picted in graphical format in Fig. 14.2 (a-e). The habi- 
tat variables identified by Laymon and Halterman 
(1989) were patch vegetation species, patch area, 
patch width, and patch distance-to-water (see Table 
14.1). 

The variables “patch area” and “patch width” 
were measured using the Spatial Analyst (ESRI 1996b) 
zonal functions. The patch “distance-to-water” vari- 
able was measured by generating a distance grid from 
lacustrine and riverine land-cover types within the 
study reach. The variable “patch vegetation species” 
was specified by Laymon and Halterman (1989) as 
“willow-cottonwood,” which was a subset of the 
mapped land-cover category “riparian” in the 1997 
land-cover mapping. To identify the species associa- 
tion within riparian forest patches the surrogate vari- 
able “floodplain age” was used as an indicator for all 

TABLE 14.1. 


Wildlife habitat relationships model variables for the western yellow-billed cuckoo (Coccyzus americanus) on the Sacramento River, 
California (adapted and modified from Laymon and Halterman 1989; Laymon et al. 1997). 


Habitat                                                                  Patch           Zonal area sum ratio      Zonal area sum ratio of
suitability        Land cover /                  Patch       Patch       distance-to-    of riparian tree          floodplain age (yrs.)
(index score)      vegetation type               area (ha)   width (m)   water (m)       height classes (H:L+M)    (young:old) (<60:>60)

Optimum (1.0)      Riparian/Willow-Cottonwood    >80         >600        <100            0.8-1.249                 2.1-4.0
Suitable (0.66)    Riparian/Willow-Cottonwood    41-80       200-600     <100            0.25-0.799 / 1.25-2.0     1.1-2.0 / 4.1-7.0
Marginal (0.33)    Riparian/Willow-Cottonwood    17-40       100-199     <100            <0.249 / [illegible]      0.6-1.0 / [illegible]
Unsuitable (0.0)   Riparian/Willow-Cottonwood    <17         <100        -               [illegible]               <0.5


willow-cottonwood associations (see Greco 1999). A 
zonal sum ratio of young floodplain to old within each 
patch using the floodplain age variable was employed 
as a means to distinguish between those patches most 
likely to contain the willow-cottonwood association 
from those containing predominantly older upper- 
floodplain tree species such as valley oak (Quercus lo- 
bata) and California black walnut (Juglans hindsii). 
The zonal sum ratio was defined as the sum total area 
of all of the cells of young floodplain (less than sixty 
years) divided by the sum total area of all of the cells 
of old floodplain (more than sixty years) within a 
patch. To calibrate this variable an estimated monoto- 
nic distribution was applied (see Fig. 14.2e). A patch 
analysis of the Pine Creek Wildlife Area suggested a 
zonal area sum ratio of 3:1 (young floodplain to old 
floodplain) is an optimal proportion for the yellow- 
billed cuckoo. 
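
The zonal sum ratio defined above can be sketched as a per-patch tally of grid cells. Because every cell has the same area, the ratio of summed areas equals the ratio of cell counts, so the sketch works with counts. The function name and input layout are hypothetical. Assigning cells aged exactly sixty years to the old class is an assumption, since the text defines only "less than" and "more than" sixty years.

```python
from collections import Counter
import math

AGE_THRESHOLD_YEARS = 60  # young floodplain is younger than this


def young_old_ratio(cells):
    """Return {patch_id: young/old ratio} from (patch_id, age_years) cells.

    A patch with no old-floodplain cells gets math.inf.
    """
    young = Counter()
    old = Counter()
    for patch_id, age_years in cells:
        if age_years < AGE_THRESHOLD_YEARS:
            young[patch_id] += 1
        else:
            old[patch_id] += 1  # assumption: exactly 60 years counts as old

    ratios = {}
    for patch_id in set(young) | set(old):
        if old[patch_id] == 0:
            ratios[patch_id] = math.inf
        else:
            ratios[patch_id] = young[patch_id] / old[patch_id]
    return ratios


# Hypothetical patch "A": three young cells and one old cell give 3:1.
example_cells = [("A", 12), ("A", 30), ("A", 45), ("A", 80)]
print(young_old_ratio(example_cells))  # {'A': 3.0}
```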

The forest structural composition variable added to 
the Laymon and Halterman (1989) model is based on 
two yellow-billed cuckoo nesting behavior and habitat 
structure studies from research conducted on the 
Sacramento River by Halterman (1991) and the Kern 
River Preserve by Laymon et al. (1997). These authors 
reported that nesting was most frequent in vegetation 
below 20 meters in height. Therefore, the model vari- 
able specifies that each patch must possess some frac- 
tion in the height size class low or medium. To cali- 
brate the forest structure variable, a home range 
analysis of four breeding pairs in 1979 at the Pine 
Creek Wildlife Area (Laymon 1980) was used to de- 
rive optimal (birds breeding) and unsuitable (no birds) 
proportions of vegetation height. The analysis for tree 
height proportions used the 1978 land-cover data set 
and suggested a zonal area sum ratio of 1:1 (high to 
medium plus low vegetation) was an optimal propor- 
tion and an upper limit of 2:1 was estimated to be un- 
suitable. A monotonic distribution was approximated 
for the calibration of this relationship (Fig. 14.2d). 
The assumption behind the calibration of this variable 
was that forest structural characteristics are a control- 
ling variable influencing the cuckoo's presence in these 
forest patches. However, other habitat factors may be 
limiting their presence. Habitat variables were com- 
bined to generate a score using a geometric mean 
equation with each variable receiving equal weight. 
The geometric mean approach was selected because it 
limits habitat suitability for a particular patch to those 
patches that contain all the necessary attributes for 
suitable habitat (Cooperrider 1986). Implementation 
of the model was accomplished with grid-based pro- 
cessing (at 0.04-hectare cell size) to perform the digital 
cartographic overlay mapping of the land-cover habi- 
tat variables. 
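
The equal-weight geometric mean described above can be sketched as follows. The function name is hypothetical and the input scores are illustrative index values of the kind listed in Table 14.1, not the values for any mapped patch. The sketch also shows the limiting behavior noted in the text: a single unsuitable variable (score 0) forces the patch score to 0.

```python
import math


def hsi_score(variable_scores):
    """Equal-weight geometric mean of per-variable suitability scores (0 to 1)."""
    if not variable_scores:
        raise ValueError("at least one variable score is required")
    for score in variable_scores:
        if not 0 <= score <= 1:
            raise ValueError("scores must lie between 0 and 1")
    # n-th root of the product of n scores
    return math.prod(variable_scores) ** (1 / len(variable_scores))


# Illustrative inputs only.
print(round(hsi_score([0.66, 1.0, 1.0]), 2))            # 0.87 (three variables)
print(round(hsi_score([0.66, 1.0, 1.0, 1.0, 1.0]), 2))  # 0.92 (five variables)
print(hsi_score([1.0, 1.0, 0.0, 1.0, 1.0]))             # 0.0
```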

Figure 14.2. (a-e) The five habitat suitability index variables 
used to identify habitat for the yellow-billed cuckoo (Coccyzus 
americanus) on the Sacramento River. 

HSI model validation. The HSI model output for 
the study reach was validated using the pooled 
observations from previous census surveys of the yel- 
low-billed cuckoo (Gaines 1974; Gaines and Laymon 
1984; Laymon and Halterman 1987b; Halterman 
1991) plus a survey conducted by this investigation in 
July and August 1998 for a total of eleven survey 
years between 1972 and 1998. The pooled observa- 
tions were grouped into twenty-one common observa- 
tion locations within the study reach (Fig. 14.1b). The 
twenty-one observation locations were then converted 
to presence-absence values and compared to the out- 
put from the HSI model. The results of the model 
were tested by performing two contingency table 
analyses using a chi-square test statistic (a Likelihood 
Ratio, SAS Institute 1994) and Cohen's Kappa (K) 
(Cohen 1960). Kappa is related to chi-square but it 
de-emphasizes the omission and commission error 
terms by placing more weight on the results the model 
predicts correctly. 
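
Both statistics can be computed from a 2 x 2 presence-absence error matrix. The sketch below uses the standard textbook forms of the likelihood-ratio statistic (G²) and Cohen's Kappa; it is not the SAS procedure used in the study, and the function names are hypothetical. The counts are those of the error matrices given in Tables 14.6 and 14.7, with rows as model prediction (present, absent) and columns as field observation (present, absent).

```python
import math


def likelihood_ratio_g2(table):
    """G^2 = 2 * sum(O * ln(O / E)) over the cells of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # empty cells contribute nothing
                expected = row_totals[i] * col_totals[j] / n
                total += observed * math.log(observed / expected)
    return 2 * total


def chi_square_p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


def cohens_kappa(table):
    """Kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    observed = sum(table[i][i] for i in range(len(table))) / n
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - chance) / (1 - chance)


full_model = [[8, 4], [0, 9]]      # five-variable model
reduced_model = [[8, 10], [0, 3]]  # three-variable model

for name, table in (("full", full_model), ("reduced", reduced_model)):
    g2 = likelihood_ratio_g2(table)
    print(name, round(g2, 2), chi_square_p_value_df1(g2),
          round(cohens_kappa(table), 2))
```

With these counts the sketch gives G² of about 12.63 and Kappa of about 0.63 for the full model, and G² of about 3.18 and Kappa of about 0.19 for the reduced model.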

Yellow-billed cuckoo surveys were conducted by 
this investigation over land in late July 1998 and by 
boat within the main channel and backwaters of the 
Sacramento River in early August 1998. The surveys 
were conducted within the study reach south of the 
Woodson Bridge State Recreation Area to Pine Creek 
Wildlife Area at river-mile 196. These surveys fol- 
lowed the sampling protocol described by Laymon et 
al. (1997). 

HSI modeling with the historical land-cover data 
sets. The final stage in this investigation was the appli- 
cation of the geographic HSI model to the five histori- 
cal land-cover data sets. This was accomplished by 
adapting the HSI Avenue script developed for the 
1997 land-cover mapping to the data sets from 1938, 
1952, 1966, 1978, and 1987. 


Results 


Land-cover Mapping and Validation 


The results from the 1997 land-cover mapping (Fig. 
14.3) and the field verification study indicated that 
87.5 percent of the polygons were correctly classified 
for the variables “stand height” and “vegetation 
canopy cover” (Tables 14.2 and 14.3), which is 
viewed as being within an acceptable target range for 
land-cover mapping accuracy (Morgan and Savitsky 
1998:171). The forest stands in the tall (high) height 
class were interpreted with greater accuracy and preci- 
sion than the medium or low height classes due to the 
greater variability in stand heights in the low and 
medium stands (Table 14.4). 


TABLE 14.2. 


Error matrix for polygon forest stand height classification. 


                                       Photo-interpreted stand height class
Field-measured stand height class      Low      Medium      High      Total N

Low                                    6        0           0         6
Medium                                 2        4           0         6
High                                   0        0           4         4
Total N                                8        4           4         16


Note: Proportion correct (accuracy): 14/16 x 100 = 87.5% 
Errors (off-diagonal entries) were denoted by a gray box in the original table. 


TABLE 14.3. 


Error matrix for polygon canopy cover classification. 


                                       Photo-interpreted canopy cover class
Field-measured canopy cover class      Sparse      Open      Medium      Dense      Total N

Sparse                                 2           0         0           0          2
Open                                   1           1         0           0          2
Medium                                 0           1         3           0          4
Dense                                  0           0         0           8          8
Total N                                3           2         3           8          16


Note: Proportion correct (accuracy): 14/16 x 100 = 87.5% 
Errors (off-diagonal entries) were denoted by gray boxes in the original table. 


Habitat Modeling of the Yellow-billed Cuckoo 


HSI modeling results. The outputs of the two versions of 
the HSI model applied to the 1997 land-cover data set 
are shown in Figure 14.4 (see color section). The two 
versions were a reduced model using three variables, 
which did not provide satisfactory results, and a full 
model using all five variables (Greco 1999). It was ev- 
ident that the full model was more discriminating in 
predicting suitable and optimal patches than the 
reduced model. No patches scored in the marginal 
category. 

Validation of HSI modeling results. Five yellow- 
billed cuckoos were detected at two of the thirteen 
sample locations in 1998. The two detection locations 
were Pine Creek Wildlife Area (PCWA) at river-mile 
197 and the River Vista Unit at Merrills Landing near 
river-mile 214. Based on the behavior of the detected 
birds, two of the four cuckoos detected at the PCWA 
were presumed to be part of a mated pair, and the 
other two were presumed to be unmated males. Only 
one cuckoo was detected at the River Vista Unit and 
was presumed to be an unmated male. 

A pooling of all the yellow-billed cuckoo census data 
for twenty-one patch locations for presence or absence 
is presented in Table 14.5 along with a comparison of 
the resultant scores generated by each HSI model for 
each location. The overall accuracy (the proportion of 
locations correctly predicted) of the full model was 81 
percent and for the reduced model 52 percent. A pres- 
ence-absence error matrix was generated for each re- 
spective HSI model (Tables 14.6 and 14.7). A chi- 
square test (Likelihood Ratio, SAS Institute 1994) 
applied to the results of the HSI models showed the full 
model, which used all five variables, was highly signifi- 
cant (G² = 12.63, df = 1, P = 0.0004), whereas the re- 
sults of the reduced model were only marginally signifi- 
cant (G² = 3.18, df = 1, P = 0.0746). The Kappa (K) 
statistical test, which is a measure of agreement be- 
tween the observed and predicted data, indicated that 
the full model was moderately significant (K = 0.63, se 
= 0.154); however, the reduced model was not signifi- 
cant (K = 0.19, se = 0.108) (see Fielding, Chapter 21, 
for further discussion of the Kappa statistic). 


TABLE 14.4. 


Summary of forest stand polygon field sampling results. 


Stand height    Stand height    Stand height    Stand height    Stand height    Precision    Significance    Stein's two-stage    Polygons field
class code      class           range (m)       mean (m)        sd (m)          (%) (d)      level (a)       sample size (N)      sampled (N)

H               High            >20             22.7            1.6             10           0.05            3                    4
M               Medium          6-20            12.7            [illegible]     20           0.10            4                    4
L               Low             <6              5.8             1.9             20           0.10            6                    8
L+M             Low and         <20             8.1             [illegible]     20           0.10            12                   12
                Medium
                combined


Figure 14.3. Results from the 1997 aerial photograph land-cover interpretation of the variables (a) canopy cover, and 
(b) vegetation height class and land-cover habitat type. Habitat types are depicted in various patterns while the size 
classes (forest stand height) are depicted in various hatching patterns. 
[Map legend: (a) canopy cover classes: dense (60-100%), medium (40-59%), open (25-39%), sparse (10-24%), none or very sparse (0-9%); 
(b) vegetation height classes: high >20 m (>65'), medium 6-20 m (20-65'), low <6 m (<20'); land-cover types: valley riparian; 
annual grassland; valley oak woodland; gravel bar; riverine, lacustrine; orchard, cropland, developed.] 


TABLE 14.5. 


Summary of pooled yellow-billed cuckoo (Coccyzus americanus) field observations and habitat suitability index (HSI) model 
predictions within the study reach. 


                                                                                Reduced model                Full model
Patch                                                     YBCU birds        HSI            HSI               HSI            HSI
location    River          Year                           obs.              score          prediction        score          prediction
number      mile           observed (@)                   (total no.)                      (±)                              (±)

1           219.0          1972, 1977, 1987               16                0.87           (+)               0.67           (+)
2           218.5          1972                           0                 0.47           (-)               0              (+)
3           218.0          1972                           0                 0              (+)               0              (+)
4           215.2          1972, 1977                     2                 0.47           (+)               0.54           (+)
5           [illegible]    1987, 1998                     2                 0.87           (+)               [illegible]    (+)
6           212.2          1998                           0                 0.47           (-)               0.47           (-)
7           211.5          1998                           0                 0.75           (-)               0.77           (-)
8           211.0          1987-1990, 1998                0                 0.87           (-)               0              (+)
9           210.0          1998                           0                 0.87           (-)               0              (+)
10          209.8          1998                           0                 0.47           (-)               0              (+)
11          209.5          1972                           4                 0.60           (+)               0.54           (+)
12          209.0          1998                           0                 0              (+)               0              (+)
13          207.6          1998                           0                 0.47           (-)               0.59           (-)
14          207.2          1987-1990                      0                 0.60           (-)               0              (+)
15          206.5          1998                           0                 0.60           (-)               0.54           (-)
16          205.5          1987-1990, 1998                0                 [illegible]    (-)               0              (+)
17          204.5          1987, 1989                     8                 0.87           (+)               0.84           (+)
18          203.0          1987-1990                      [illegible]       0.87           (+)               0.84           (+)
19          201.5          1977                           [illegible]       0.47           (+)               0.47           (+)
20          199.0          1972                           0                 0              (+)               0              (+)
21          197.0          1972, 1977, 1987-1990, 1998    135               0.87           (+)               0.92           (+)

Note: YBCU = yellow-billed cuckoo; @ = see text for census study references; (+) = correctly predicted; (-) = incorrectly predicted. 


TABLE 14.6. 


Error matrix for the yellow-billed cuckoo (Coccyzus americanus) 
reduced habitat suitability index (HSI) model (three variables). 


N = 21                              Field observation
HSI reduced model prediction        Present      Absent

Present                             8            10
Absent                              0            3


TABLE 14.7. 


Error matrix for the yellow-billed cuckoo (Coccyzus americanus) 
full habitat suitability index (HSI) model (five variables). 


N = 21                              Field observation
HSI full model prediction           Present      Absent

Present                             8            4
Absent                              0            9


Figure 14.6. Trends in the total area and quality of yellow-billed cuckoo (Coccyzus americanus) habitat from 1938 to 1997 on the 
Sacramento River, river-miles 196-219. 


From the standpoint of wildlife habitat relationship 
modeling, an omission error can have a greater 
negative biological consequence than a commission 
modeling error (Morrison et al. 1992; Garrison 1993). 
For example, in this study, an omission error would be 
an instance where the HSI model failed to predict the 
presence of potential yellow-billed cuckoo habitat 
where cuckoos have been observed in field surveys. A 
commission error, on the other hand, would be an in- 
stance where the model predicted the cuckoo's pres- 
ence in a patch where no yellow-billed cuckoos have 
been detected by field surveys. “Optimal or suitable” 
habitats are not always 100 percent occupied (Van 
Horne 1983; Morrison et al. 1992) and may be occu- 
pied in one year and not in the next. Therefore, com- 
mission error is a more “preferable” type of ecological 
error than omission error (Garrison 1993). Both of the 
spatial HSI models in this study had no omission error 
whereas the commission error was 19 percent for the 
full model and 48 percent for the reduced model. 
Since the yellow-billed cuckoo is presently a relatively 
rare species (i.e., endangered) and a habitat specialist 
on the Sacramento River, the commission error rates 
may have been reduced with larger sample sizes for 
detection (i.e., a “rarity effect”; see Karl et al., Chap- 
ter 51). 
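
The omission and commission rates quoted above follow directly from the presence-absence error matrices. In the sketch below the rates are expressed as a share of all twenty-one locations, which is how the percentages in the text appear to have been calculated. The function name and argument order are hypothetical.

```python
def error_rates(pred_present_obs_present, pred_present_obs_absent,
                pred_absent_obs_present, pred_absent_obs_absent):
    """Return accuracy, omission, and commission as fractions of all locations."""
    n = (pred_present_obs_present + pred_present_obs_absent
         + pred_absent_obs_present + pred_absent_obs_absent)
    return {
        # locations where prediction and observation agree
        "accuracy": (pred_present_obs_present + pred_absent_obs_absent) / n,
        # birds observed where the model predicted absence
        "omission": pred_absent_obs_present / n,
        # presence predicted where no birds were detected
        "commission": pred_present_obs_absent / n,
    }


full = error_rates(8, 4, 0, 9)      # five-variable model
reduced = error_rates(8, 10, 0, 3)  # three-variable model

print(round(100 * full["commission"]), round(100 * full["omission"]))        # 19 0
print(round(100 * reduced["commission"]), round(100 * reduced["omission"]))  # 48 0
```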

Results of HSI model application to historical land- 
cover data sets. The final step in this investigation was 
to apply the full HSI model to the historical land- 
cover data sets interpreted from 1938, 1952, 1966, 
1978, and 1987 (see Fig. 14.5a-e in color section). 
The quantity of optimal habitat appears 
to have decreased from 1938 to 1966 and increased 
from 1966 to 1997 (Fig. 14.6). The spatial distribu- 
tion of suitable and optimal habitat for the yellow- 
billed cuckoo shifted considerably from 1938 to 1997 
(Fig. 14.5a-f). Some general patterns of the rates of 
habitat patch formation and extinction are evident. 

Perhaps the most revealing example of habitat for- 
mation is the Pine Creek Wildlife Area (PCWA) at 
river-mile 196 (at the southern extreme of the study 
reach). This location has had consistently high occu- 
pancy by yellow-billed cuckoos since 1977. In the ex- 
tensive census work done by Halterman (1991) on the 
Sacramento River (from Red Bluff to Colusa), the 
PCWA location had the highest occupancy (total num- 
ber of birds) of yellow-billed cuckoos as compared to 
any other sampled location on the river during the 
1987-1990 time period. Results of the modeling indi- 
cate that the riparian forests at the Pine Creek bend 
began to form during the late 1960s and steadily pro- 
gressed into the 1970s to become optimal habitat in 
1978 (Fig. 14.5d). Prior to the 1970s (Fig. 14.5a-c), 
however, optimal habitat patches were apparently lo- 
cated 9-20 kilometers to the north of Pine Creek bend 
near the Glenn-Colusa Irrigation District (GCID) canal 
pumping facility and Snaden Island (near river-mile 
206). 

The formation of riparian forests and oxbow lakes 
at the River Vista Unit (Merrills Landing at river-mile 
214) between 1978 and 1987 suggests that the opti- 
mal conditions (as defined in the WHR model in Table 
14.1) for yellow-billed cuckoo habitat formation can 
occur within a nine-year time span. Some locations 
may require more time; for example, the Pine Creek 
Wildlife Area took twelve years to form. In terms of 
persistence, the southern tip of the riparian forests at 
Kopta Slough (near river-mile 219) appears to have 
lasted as suitable habitat for thirty-one years while the 
location near GCID at river-mile 205.5 evidently 
lasted twenty-eight years (1938-1966). Between 1938 
and 1997, the mean percent of size (height) classes 
found not to change between time periods was 17 per- 
cent for low vegetation, 33 percent for medium-sized 
vegetation, and 62 percent for tall vegetation (more
than 20 meters). Maintenance of low- and medium- 
sized forests appears to be primarily a function of sub- 
strates made available from channel migration and 
meander bend cutoff processes (Greco 1999). 

Several extinction patterns are also evident from 
the modeling results, though we can only speculate as 
to the cause due to the extensive forest clearing con- 
ducted by farmers throughout the study reach. The 
largest patch to lose habitat value for the yellow-billed 
cuckoo was located at river-mile 215 (right bank) depicted
in 1938 (Fig. 14.5a). This result from the model
suggests that habitat values can be altered in less than 
fourteen years from what appears to have been a river- 
meander-induced change (from optimal to unsuitable 
habitat conditions). Hence, the erosive forces of the 
channel eliminated essential habitat conditions, in- 
cluding low willow (Salix spp.) forests and a backwa- 
ter channel. The GCID location lost its suitable yel- 
low-billed cuckoo habitat between 1966 and 1978. 
Examination of the GCID location suggests that vege- 
tation growth rates and shifts in vegetation species 
composition (i.e., succession) were responsible for the 
change in habitat value within a period of less than 
twelve years. 


Discussion 


The concepts and theories of shifting mosaics within 
the patch dynamics approach to studying heterogeneous
landscapes (e.g., Bormann and Likens 1979;
Pickett and White 1985; Malanson 1993; Pickett and 
Rogers 1997) provide a framework within which to 
evaluate yellow-billed cuckoo core habitat dynamics. 
The concept of a "minimum dynamic area" as described
by Pickett and Thompson (1978) is an approach
applicable for anticipating changes to the shifting
habitat mosaic of an endangered species. The
proposed Sacramento River Conservation Zone
(SRCZ) as detailed in the Sacramento River Conservation
Area Handbook (California Resources Agency
1998) is a good starting point for defining this type of
management strategy.

Pickett and Rogers (1997) argued that the shifting 
mosaic approach is robust enough to unify population 
ecology with community and landscape ecology be- 
cause spatial gradients can have great influence on 
metapopulation dynamics and ecosystem functions. 
The spatial arrangement of habitat patches and 
metapopulation distributions on the landscape are im- 
portant factors for estimating the potential value of a 
landscape to a species of concern (Hanski and Gilpin 
1991; Harrison 1991; Harrison and Fahrig 1995). 
Predicting how a mosaic of suitable habitat will 
change with respect to time (Dunn et al. 1991) is a se- 
rious management challenge when habitat resources 
are limited, as is the case on the Sacramento River. 

Understanding the turnover rates of habitat fea- 
tures and the sustainability of suitable habitat through 
time (habitat demography) is clearly critical to long- 
range recovery planning of endangered species. This 
investigation shows there has been replacement and 
an increasing trend of suitable and optimal habitat for 
the yellow-billed cuckoo since 1978 within the extent 
of the study reach. This replacement and enhancement 
of habitat was enabled by the processes of river mean- 
der dynamics that create complex geomorphological 
floodplain surfaces and oxbow lakes (through channel 
cutoff events) upon which riparian forests colonize 
and evolve (see Malanson 1993). Essential forest types 
for the yellow-billed cuckoo are low and medium- 
sized willow-cottonwood associations (Halterman 
1991; Laymon et al. 1997). The low- and medium- 
height forest stands have the highest rates of turnover 
in the riparian landscape as shown by Greco (1999). 
Management should anticipate these shifts by periodi- 
cally reassessing reserve area functionality and desig- 
nating new reserve locations in areas undergoing geo- 
morphic changes. 


Conclusion 


The decline of the yellow-billed cuckoo in the western 
United States is a prime example of the cumulative effects
of the widespread removal of floodplain habitats
(Gaines 1977) and extensive alterations to the fluvial 
geomorphological processes that create and maintain 
habitat for endangered species (e.g., Scott et al. 1997). 
A critical point raised by Pickett and Rogers 
(1997:121-122) is that "sustainability must be judged
in part on the maintenance of community function"
and "[l]ack of information on the function of patch
mosaics is currently the largest limit to ecological
knowledge needed to manage patch dynamics effectively."
Our results support the validity of this statement
in terms of management of riparian forested
floodplains on the Sacramento River, especially for the 
management of endangered species such as the yellow- 
billed cuckoo. 

To effectively plan a conservation strategy for the 
yellow-billed cuckoo on the Sacramento River, the dy- 
namics of the riparian floodplain landscape must be 
taken into consideration. A goal of a recovery plan for 
the yellow-billed cuckoo along the whole river should 
emphasize the need to restore ecosystem processes 
that maintain the forest types that constitute suitable 
and optimal habitat for the cuckoo. A sustainable re- 
serve system for the cuckoo should anticipate shifts in 
the habitat mosaic. Restoration of the riparian ecosys- 
tem on the middle and lower reaches of the Sacra- 
mento River (above and below the study reach) should 
focus on maintenance of channel hydrodynamic 
processes, such as channel migration and bend cutoff, 
that give rise to complex riparian floodplain forest 
mosaics. Distance between levees should be widened
in several locations both above and below the town
of Colusa (near river-mile 145) to allow for more- 
extensive riparian forest and floodplain formation. As 
a long-term conservation strategy, anticipatory and 
adaptive management should identify currently active 
meander lands and soon-to-be-active lands (including 
acquisition of both sides of the river channel) as high 
priority reserve areas to allow for essential habitats to 
form in the future to maintain viable breeding popula- 
tions of yellow-billed cuckoos on the Sacramento 
River. 


Effects of Spatial Scale on the Predictive Ability of Habitat Models
for the Green Woodpecker in Switzerland


Claudine Tobalske 


Scale influences every aspect of ecological research
(Wiens 1989a; Levin 1992), and models relating 
the distribution of wildlife species to characteristics of 
their environment are no exception. Three different 
spatial scales can characterize wildlife habitat relation- 
ships (WHR) models: (1) the grain of the species’ dis- 
tribution data; (2) the grain of the habitat variables; 
and (3) the extent of the study area. Changes in any of 
these are likely to affect the predictive ability of mod- 
els. Increasing the grain size of the species’ distribution 
data may reveal patterns hidden by individual vari- 
ability, as with the American redstart (Setophaga ruti- 
cilla), a species that is more selective at the territory 
than at the nest-site level (Sodhi et al. 1999). Alterna- 
tively, it may lower the classification success of models 
developed for species that base habitat selection on 
microhabitat characteristics (such as the dusky flycatcher,
Empidonax oberholseri; Kelly 1993). The
value taken by habitat variables is also likely to be in- 
fluenced by the grain at which they are collected. For 
example, the characteristics of a landscape—the rela- 
tive proportion of land cover types and their arrange- 
ment in patches—are grain dependent: increasing the 
minimum mapping unit (MMU) of a raster map will 
affect landscape composition and configuration 
through the loss of certain cover types (Turner et al. 
1989a,b). These changes will, in turn, affect modeling 
output. Predictive accuracy of the model may increase,
because the "noise" that obscured patterns is eliminated;
or it may decrease, because small but important
habitat patches associated with the species’ presence 
are gone. 

These issues are especially important when the 
modeler has little control over the scale of the data 
used to develop the model. The current proliferation 
of breeding-bird distribution atlases offers a wealth of 
distribution data that can be used to derive WHR 
models (Gates et al. 1994; Tobalske and Tobalske 
1999). These data, however, are traditionally pre- 
sented in a grid format at a single, fixed scale that may 
not be biologically meaningful to the species being 
modeled and may not maximize classification success. 
Because habitat selection may be a hierarchical 
process occurring at several scales (Hutto 1985), pat- 
terns of bird-habitat relationships are likely to vary 
with the scale of investigation (Wiens 1985; Wiens et 
al. 1987). Indeed, several studies have shown the value 
of viewing habitat selection at more than one scale 
(e.g., Virkkala 1991; Bergin 1992; VanderWerf 1993). 
If this is not possible, then the scale of analysis should 
at least be compatible with the goals of the study. For 
example, Fielding and Haworth (1995) elected to 
work with 1-square-kilometer atlas grid cells because 
this grain was appropriate for a wide-ranging study 
aimed at identifying suitable nesting habitat rather 
than specific nest sites. 

In this study, I use atlas distribution maps of the 
green woodpecker (Picus viridis) from two Swiss
breeding-bird atlases to assess the effect of changing 
the grain of distribution data (atlas cell size) and the 
grain of habitat variables (MMU) on model classifica- 
tion results. 


Methods 


The Orbe Valley and Geneva Canton are both situated 
in the western part of Switzerland, although a portion 
of the Orbe Valley extends into eastern France (Fig. 
15.1). Similar in extent, they present very different 
landscapes (see Table 15.1; see Fig. 15.2 in color section).
The Orbe Valley is characteristic of high-elevation
valleys of the Jura mountain range (PNRHJ 1988): the 
valley floor is open pasture surrounded by dense, unbro- 
ken mixed forests dominated by Norway spruce (Picea 
abies) and European beech (Fagus sylvatica). Urban de- 
velopment is minimal and scattered. Elevation ranges 
from 972 to 1,669 meters. Forestry, dairy farming, and 
tourism are the principal economic activities. By con- 
trast, Geneva Canton is a highly developed agricultural 
landscape dominated by crops and fields, with impor- 
tant urban and aquatic components (the city of Geneva 
and the Lake of Geneva). Forests, mostly deciduous, 
occur as small patches embedded in the agricultural ma- 
trix, and elevation ranges from 328 to 563 meters. 


Digital Database 

Both atlases used in this study present green wood- 
pecker breeding distribution data in the form of 1- 
square-kilometer (100-hectare) grid cells (Géroudet et 
al. 1983; Glayre and Magnenat 1984). Breeding green
woodpeckers were censused in 49 of the 273 cells
(17.9 percent) of the Orbe Valley and in 203 of the 
306 cells (66.3 percent) of Geneva Canton. I used the 
geographic information system (GIS) software ArcInfo 
version 7.0.3 (ESRI 1995) on a Unix workstation to
digitally recreate the atlas grids. 

The models were built from a pool of seven vari- 
ables: five land-cover classes, mean elevation, and edge 
density. These variables were selected based on their 
potential importance to the green woodpecker (To- 
balske 1998) and because they were available for both 
study sites. Land cover was extracted for each atlas cell 
from a 1988 classified Landsat TM image with a 25- 
meter pixel resolution (0.0625-hectare MMU). The su- 
pervised classification was labeled by intensive 
groundtruthing of a topographically diverse 15-by-15-kilometer
area, and by comparison with existing,
fine-scale land-cover maps (Vuillod 1994), and
resulted in five land-cover classes: oak-hornbeam- 
beech (Decid), pure beech (Beech), beech-fir-spruce 
(Conif), conifer plantations (Planted), and not forested 
(Open). I computed edge density (Edgeden) between 
forest and nonforest patches for each cell, after group- 
ing the first four classes into one, and vectorizing the 
resulting file. I obtained mean elevation (Meanelev) for 
each cell by averaging elevation values from a 50-meter 
digital elevation model purchased from the French In- 
stitut Géographique National. 


TABLE 15.1.

Land cover composition of the Orbe Valley and Geneva Canton, Switzerland, as
obtained from a classified Landsat TM satellite image.

                                 Orbe Valley              Geneva Canton
Land cover class               Area (ha)   Percent     Area (ha)   Percent
Oak-hornbeam-beech forest           0.81      0.00      3,055.13      8.19
Beech forest                    3,409.19     10.59      1,994.69      5.39
Beech-fir-spruce forest        16,071.44     49.94        594.44      1.59
Conifer plantation                861.31      2.68        275.94      0.74
Not forested                   11,841.17     36.79     31,377.44     84.13
Total                          32,183.92    100.00     37,297.64    100.00


Figure 15.1. Location of the two study sites, Orbe Valley and Geneva Canton, in Switzerland.


Modeling Procedures


I used multiple logistic regression (LR) to create models
to classify the presence and absence of the green
woodpecker. To prevent multicollinearity, I computed
Pearson product-moment coefficient (r) between all 
pairs of variables and eliminated one variable from 
pairs with r greater than 0.7 (Green 1979). The decision
about which variable to eliminate was based on
the results of univariate LR (log likelihood and Wald 
statistic; SPSS 1990). Parsimonious models were de- 
veloped from the remaining pool of variables using 
both forward and backward stepwise selection proce- 
dures. Stepwise procedures were employed because 
the pool of variables from which the models were 
built was already small and because these procedures 
provided an objective, repeatable approach to model 
building. When the two procedures resulted in differ- 
ent models, I retained the better of the two models 
based on log likelihood, Wald statistic of the predictor 
variables, and improvement of the model over chance 
classification as estimated from Cohen's Kappa (K) 
(Titus et al. 1984). Addition of variables in the for- 
ward procedure was based on the Wald statistic (P-of- 
entry = 0.05); removal of variables in the backward 
procedure was also based on the Wald statistic (P-of-removal
= 0.1). Because the output of LR is probabilistic,
allocation of cases to predicted groups (presence
or absence) required that a cutoff be defined; I retained
the midpoint between the mean probabilities
for the presence and absence cells (Fielding and Ha- 
worth 1995). Even though this rule may not maximize 
Kappa, it was adopted because of its objectivity and 
consistency. 
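
Two steps of the procedure above, the collinearity screen and the midpoint cutoff, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study itself used SPSS, the function names and all cell values below are hypothetical, and applying the 0.7 threshold to the absolute value of r is an assumption rather than something the text specifies.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(variables, threshold=0.7):
    """Return (name_a, name_b, r) for every pair whose |r| exceeds the threshold.

    `variables` maps a variable name to its list of per-cell values.
    Using |r| (so that strong negative correlations are also flagged)
    is an assumption made for this sketch.
    """
    names = sorted(variables)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(variables[a], variables[b])
            if abs(r) > threshold:
                flagged.append((a, b, r))
    return flagged


def midpoint_cutoff(probabilities, observed):
    """Midpoint between the mean predicted probability of presence cells
    and the mean predicted probability of absence cells."""
    presence = [p for p, o in zip(probabilities, observed) if o == 1]
    absence = [p for p, o in zip(probabilities, observed) if o == 0]
    return (mean(presence) + mean(absence)) / 2


# Hypothetical values for five atlas cells (percent cover, metres per hectare).
cells = {
    "Conif": [10, 40, 60, 80, 90],
    "Open": [85, 55, 35, 15, 5],
    "Edgeden": [120, 30, 90, 60, 100],
}
flagged = collinear_pairs(cells)

# Hypothetical fitted probabilities for two presence and two absence cells.
cutoff = midpoint_cutoff([0.8, 0.6, 0.3, 0.1], [1, 1, 0, 0])
```

In this made-up data set only the Conif/Open pair is flagged, so one of the two would be dropped before stepwise selection; which one is dropped would then depend on the univariate LR results, as described above.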

To assess the influence of atlas cell size on classifi- 
cation results, I created new, scaled-up distribution 
maps by grouping four adjacent 100-hectare atlas 
cells. If at least one of the four cells was coded as pres- 
ence, the new, 400-hectare cell was coded as presence 
(Fig. 15.3). Because this coding depended on which 
cells were aggregated, four maps were created to cover 
all the possible allocations of 100-hectare cells, and a 
model developed with each. Aggregates that had only 
two or three cells (along the edge of the study areas) 
were dropped from analysis. The high proportion of 
presence cells in Geneva Canton resulted in only three 
aggregated cells being coded as absence (Fig. 15.3), so 
large-cell models could only be built for the Orbe
Valley.

Figure 15.3. Effect of aggregating four cells of the census grid on the distribution of presences and absences for the green woodpecker (Picus viridis) in two study sites in Switzerland. Legend: 100-ha cells; 400-ha cells; presence in 100-ha cells; presence in 400-ha cells; absence in 400-ha cells.
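
The aggregation rule described above (a 400-hectare cell is a presence if any of its four 100-hectare cells is a presence, incomplete aggregates are dropped, and four maps cover the possible alignments) can be sketched as follows. The dictionary layout, the function name, and the toy grid are assumptions made for illustration; the original work was done in a GIS.

```python
def aggregate(cells, row_offset, col_offset):
    """Group 100-ha cells into 2 x 2 blocks (400 ha) for one grid alignment.

    `cells` maps (row, col) of each 100-ha cell inside the study area to
    1 (presence) or 0 (absence). Blocks with fewer than four cells, which
    occur along the edge of the study area, are dropped.
    """
    blocks = {}
    for (row, col), present in cells.items():
        key = ((row - row_offset) // 2, (col - col_offset) // 2)
        blocks.setdefault(key, []).append(present)
    return {key: int(any(values))
            for key, values in blocks.items() if len(values) == 4}


# Hypothetical 4 x 4 study area with a single presence cell at row 1, column 1.
grid = {(r, c): 0 for r in range(4) for c in range(4)}
grid[(1, 1)] = 1

# One aggregated map per possible alignment of the 2 x 2 blocks.
maps = {(dr, dc): aggregate(grid, dr, dc) for dr in (0, 1) for dc in (0, 1)}

for offset, coarse in sorted(maps.items()):
    presences = sum(coarse.values())
    print(offset, "presence cells:", presences,
          "absence cells:", len(coarse) - presences)
```

Replacing `int(any(values))` with `sum(values)` would instead give the number of occupied 100-hectare cells per aggregate, a simple count that retains more of the original information.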

The influence of the grain of habitat variables on 
predictive ability was assessed for both study sites by 
resampling the classified Landsat image from a 
0.0625-hectare MMU to a 1-hectare MMU using a 
rule-based algorithm (Ma 1995). Previous manipula- 
tions showed that increasing the MMU from 1 hectare 
to 2 or 4 hectares resulted in little additional changes 
(Tobalske 1998). The LR modeling approach de- 
scribed above was used to derive the following mod- 
els: (1) 100-hectare cells, 0.0625-hectare MMU (both 
study sites); (2) 400-hectare cells, 0.0625-hectare 
MMU (Orbe Valley only); (3) 100-hectare cells, 
1-hectare MMU (both study sites); and (4) 400- 
hectare cells, 1-hectare MMU (Orbe Valley only). Be- 
cause high rates of presences and absences correctly 
classified can be obtained by chance alone (Morrison 
1969; Capen et al. 1986), I used improvement over 
chance classification (Kappa) to compare the models’ 
predictive accuracy. After comparing the performances 
of several confusion matrix-based measures, Fielding 
and Bell (1997) concluded that Kappa is one of the
most suitable because it makes use of all available information
in the confusion matrix.
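
As a worked illustration of why Kappa is preferred to raw classification rates, the sketch below computes Cohen's Kappa from a 2 x 2 confusion matrix. The function name is hypothetical. The cell counts are not given in the chapter: they are back-calculated here from the percentages in Table 15.2 and the presence/absence counts reported in the Results (49/224 and 30/24 for the Orbe Valley), rounded to whole cells, and should be read as an assumption.

```python
def cohen_kappa(tp, fn, fp, tn):
    """Cohen's Kappa for a presence/absence confusion matrix.

    tp: presences predicted as presences   fn: presences predicted as absences
    fp: absences predicted as presences    tn: absences predicted as absences
    """
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    return (observed - expected) / (1 - expected)


# Orbe Valley, 0.0625-ha MMU, 100-ha cells: 49 presences (61.2% correct),
# 224 absences (69.2% correct). Counts are back-calculated assumptions.
kappa_100 = cohen_kappa(tp=30, fn=19, fp=69, tn=155)

# Orbe Valley, 0.0625-ha MMU, 400-ha cells: 30 presences (80.0% correct),
# 24 absences (70.8% correct).
kappa_400 = cohen_kappa(tp=24, fn=6, fp=7, tn=17)

# A trivial model that predicts "presence" in every Geneva Canton cell is
# right in 203 of 306 cells (66.3%), yet shows no improvement over chance.
kappa_trivial = cohen_kappa(tp=203, fn=0, fp=103, tn=0)

print(round(kappa_100, 3), round(kappa_400, 2), round(kappa_trivial, 2))
```

The first two values round to the Kappa figures reported for these models (0.218 and 0.51), and the third shows how a high raw classification rate can coincide with a Kappa of zero.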


Results 


Geographic location, atlas cell size, and MMU all in- 
fluenced the composition of the regression equations 
and the classification accuracy of the models predict- 
ing green woodpecker presence and absence (Table 
15.2). Elevation was the only consistent variable and 
entered all the models negatively. The main difference 
between models for the Orbe Valley and models for 
Geneva Canton was the sign reversal of the variable 
Conif. Classification accuracy, as measured by 
Cohen’s Kappa, varied from 0.22 (Orbe Valley, 
0.0625-hectare MMU, 100-hectare cell size) to 0.53 
(Orbe Valley, 1-hectare MMU, 400-hectare cell size). 
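
To make the regression equations in Table 15.2 concrete, the sketch below converts one of them into a predicted probability of presence. It assumes that each equation is the linear predictor (logit) of a standard logistic model, that Meanelev is in meters and Edgeden in meters per hectare, and it uses two hypothetical cells; the function name is likewise an assumption.

```python
from math import exp


def presence_probability(intercept, coefficients, values):
    """Logistic transform of a linear predictor: p = 1 / (1 + exp(-logit))."""
    logit = intercept + sum(coefficients[name] * values[name]
                            for name in coefficients)
    return 1.0 / (1.0 + exp(-logit))


# Orbe Valley model, 1-ha MMU, 400-ha cells (coefficients from Table 15.2).
INTERCEPT = 8.9399
COEFFICIENTS = {"Meanelev": -0.0094, "Edgeden": 0.0497}

# Two hypothetical cells that differ only in mean elevation.
low_cell = {"Meanelev": 1000.0, "Edgeden": 30.0}
high_cell = {"Meanelev": 1300.0, "Edgeden": 30.0}

p_low = presence_probability(INTERCEPT, COEFFICIENTS, low_cell)
p_high = presence_probability(INTERCEPT, COEFFICIENTS, high_cell)
print(f"low elevation: {p_low:.2f}, high elevation: {p_high:.2f}")
```

Because the elevation coefficient is negative, the higher cell receives the lower probability; a cell would then be classified as presence or absence by comparing its probability with the midpoint cutoff described in the Methods.
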

TABLE 15.2.

Regression equations and classification results (percent presence and absence correctly classified, and Kappa value) for logistic
regression models developed for the green woodpecker (Picus viridis) in two study sites in Switzerland.

Site            MMU (ha)^a  Cell size (ha)^b  Regression equation                                         P (%)   A (%)   Kappa
Orbe Valley     0.0625      100               2.0957 - 0.0029 Meanelev - 0.0284 Conif + 0.3865 Planted    61.2    69.2    0.22
                            400               0.39 - 0.0678 Conif + 1.1775 Planted                        80.0    70.8    0.51
                1           100               4.7992 - 0.0061 Meanelev + 0.02 Edgeden + 1.0318 Planted    64.2    72.3    0.25
                            400               8.9399 - 0.0094 Meanelev + 0.0497 Edgeden                   77.8    75.0    0.53
Geneva Canton   0.0625      100               8.7277 - 0.0218 Meanelev + 0.0158 Edgeden                   68.5    68.9    0.35
                1           100               12.0752 - 0.0291 Meanelev + 0.292 Conif                     83.7    36.9    0.22

^a Models were created from a classified Landsat TM scene with two different minimum mapping units (MMU).
^b Distribution data were extracted for two different cell sizes.

Increasing the MMU from 0.0625 hectare to 1
hectare had a strong influence on edge density values
(Table 15.3) and reversed the significance of Edgeden
in univariate LR (from nonsignificant to significant in
the Orbe Valley and from significant to nonsignificant
in Geneva Canton, even using a P-value as high as
0.2). It did not affect the composition of the two landscapes
as much as it affected their configuration: mean
patch size increased dramatically (Table 15.3) as small 
patches of less than 16 pixels (1 hectare) were elimi- 
nated. In Geneva Canton, the 1-hectare MMU model 
resulted in fewer misclassifications among the pres- 
ences (83.7 percent correctly classified; Table 15.2), 
but fewer absences were correctly classified than with 
the 0.0625-hectare MMU model (36.9 percent versus 
68.9 percent; Table 15.2), so overall model perform- 
ance as measured by Cohen's Kappa was lower at 
1-hectare MMU (0.223 versus 0.347; Table 15.2). In 
the Orbe Valley model, the variable Conif was re- 
placed by Edgeden at 1-hectare MMU, but this had 
little effect on classification results (K = 0.218 at
0.0625-hectare MMU versus K = 0.250 at 1-hectare
MMU; Table 15.2).
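
The effect of a larger MMU on landscape configuration can be illustrated with a generic "sieve" that removes patches smaller than a minimum number of pixels. This is not the rule-based algorithm of Ma (1995), whose details are not given here; it is a simplified stand-in that assumes 4-connectivity, a single pass, and reassignment of each small patch to its most common bordering class. The toy map and function names are hypothetical.

```python
from collections import Counter, deque

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connectivity (assumed)


def find_patches(grid):
    """Group same-class pixels into patches: list of (class, [(row, col), ...])."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            seen[r][c] = True
            queue, cells = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == grid[r][c]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            patches.append((grid[r][c], cells))
    return patches


def sieve(grid, min_pixels):
    """Reassign each patch smaller than min_pixels to its most common bordering class."""
    rows, cols = len(grid), len(grid[0])
    result = [row[:] for row in grid]
    for value, cells in find_patches(grid):
        if len(cells) >= min_pixels:
            continue
        border = Counter()
        for y, x in cells:
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if 0 <= ny < rows and 0 <= nx < cols and grid[ny][nx] != value:
                    border[grid[ny][nx]] += 1
        if border:
            replacement = border.most_common(1)[0][0]
            for y, x in cells:
                result[y][x] = replacement
    return result


def mean_patch_size_ha(grid, pixel_ha=0.0625):
    """Mean patch size in hectares for 25-m pixels (0.0625 ha each)."""
    patches = find_patches(grid)
    return sum(len(cells) for _, cells in patches) * pixel_ha / len(patches)


# Hypothetical 6 x 6 pixel map: open land (O), a 2 x 2 beech patch (B),
# and one isolated conifer pixel (C).
fine = [list(row) for row in
        ("OOOOOO", "OBBOOO", "OBBOOO", "OOOOOO", "OOOOCO", "OOOOOO")]
coarse = sieve(fine, min_pixels=16)  # 16 pixels of 0.0625 ha = 1 ha

print(mean_patch_size_ha(fine), mean_patch_size_ha(coarse))
```

In this toy map both small forest patches are absorbed by the open class, so mean patch size rises while two cover types disappear, which is the kind of change in configuration and composition discussed above.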

Spearman rank correlations, univariate LR results, 
and regression equations for the four models created 
in the Orbe Valley from the 400-hectare atlas cells 
were similar within MMUs, so I kept only the best
model at each MMU for comparison. At 0.0625- 
hectare MMU, the presence/absence ratio changed 
from 0.22 (49/224) with 100-hectare cells, to 1.25 
(30/24) with 400-hectare cells. At 1-hectare MMU, 
this ratio changed from 0.22 to 0.84 (27/32). At both 
MMUs, increasing atlas cell size from 100 to 400
hectares more than doubled Kappa (Table 15.2). This 
increase resulted from a better prediction of green 
woodpecker presences for the 400-hectare cell models
compared to the 100-hectare cell models (from 61.2
percent to 80.0 percent at 0.0625-hectare MMU and
from 61.2 percent to 77.8 percent at 1-hectare MMU;
Table 15.2). Classification rates of absences remained
fairly constant, around 70 percent at both MMUs
(Table 15.2). Differences in K values between 100-
and 400-hectare models therefore resulted from a decrease
in omission errors (presences predicted as absences),
not in commission errors (absences predicted
as presences).


TABLE 15.3.

Characteristics of two study sites at two different minimum mapping units (0.0625 hectare
and 1 hectare): composition (percentage of five Landsat TM classes), mean patch size
(hectares, in parentheses), and edge density (meters per hectare; mean value computed
from 100-hectare cells).

                       Orbe Valley                     Geneva Canton
                 0.0625 ha       1 ha             0.0625 ha       1 ha
Decid            [illegible]     [illegible]      8.2 (0.3)       6.4 (7.3)
Beech            10.6 (0.2)      4.9 (4.0)        5.4 (0.2)       2.0 (4.4)
Conif            49.9 (3.1)      58.6 (79.5)      1.6 (0.1)       0.5 (4.9)
Planted          2.7 (0.1)       0.0 (2.00)       0.7 (0.1)       0.2 (2.9)
Open             36.8 (1.3)      36.5 (41.3)      84.1 (9.3)      90.9 (721.4)
Edgeden ± SD     106.9 ± 49.3    [illegible]      83.9 ± 48.7     [illegible]


Discussion 


The goal of this study was to assess how changes in 
grain of the habitat and distribution data affected the 
classification results of models developed for the green 
woodpecker in two Swiss areas. The results suggest 
that model performance was a function of the land- 
scape characteristics of the study sites, the MMU of 
the habitat map, and the size of the atlas distribution 
grid cells. 

The models developed for the Orbe Valley and 
Geneva Canton comprised different variables, and in- 
creasing the MMU of the land-cover map had differ- 
ent consequences on the predictive ability of the mod- 
els in each site (Table 15.2). Although species-habitat 
associations may vary geographically (e.g., Collins
1983; Shy 1984), the differences observed between the 
two study sites are more likely an artifact caused by 
the scale of the study and by the variables used to de- 
rive the models. In the Orbe Valley, univariate LR 
analyses showed that the presence of the green wood- 
pecker was negatively associated with the variable 
Conif and, conversely, positively associated with the 
variable Open, and that there was no significant corre- 
lation with edge. The exact opposite was found in 
Geneva Canton: positive correlation with Conif, nega- 
tive correlation with Open, and strong significance of 
the variable Edgeden (P < 0.0005). This apparent con- 
tradiction disappears when the structure and composi- 
tion of the entire landscapes are considered instead of 
composition within individual atlas cells. Indeed, for- 
est patch characteristics differ between the two sites 
(Fig. 15.2). In the Orbe Valley, they tend to be large 
and unbroken. The green woodpecker is known to 
avoid closed, dense coniferous forests, favoring instead
open or broken deciduous or mixed forests with
grassy fringes or clearings (Cramp 1985; Spitznagel 
1990; Hagvar et al. 1990; Angelstam and Mikusinski 
1994); hence, the negative correlation with Conif. By 
contrast, in Geneva Canton, forest patches are smaller 
and scattered in the agricultural matrix. Although 
considered more an arboreal than a forest species 
(Cramp 1985), the green woodpecker still requires 
forest patches for nesting. Hence, the positive correla- 
tion between the species’ presence and forest classes 
(and the correlated variable Edgeden) in Geneva Can- 
ton. Because the models did not incorporate patch 
configuration attributes such as patch size, fundamen- 
tal differences between the two sites could not be 
taken into account during the modeling phase. 

Scale has been defined as the interaction of grain 
and extent, where grain relates to the level of resolu- 
tion (i.e., MMU) and extent relates to the largest enti- 
ties that can be detected in the data (size of the study 
area or duration of time under consideration; Allen 
and Hoekstra 1991; Turner et al. 1989a,b; 1993). 
Using this definition, the Orbe Valley and Geneva 
Canton study sites were at similar scales; however, be- 
cause of the presence of larger forest patches, the Orbe 
Valley can be considered a “coarse-grained” landscape
whereas Geneva Canton may be considered a “fine- 
grained” landscape (Forman and Godron 1986). Spa- 
tially explicit models that incorporate information 
about patch size and arrangement (Van Horne 1991) 
are likely to have higher predictive capabilities than 
composition-based models, because landscape pat- 
terns exert a strong influence on species’ distribution 
(Hansen and Urban 1992; Gustafson et al. 1994; Le- 
scourret and Genard 1994; Farina 1997). Woodpeck- 
ers, because of their large territories, are likely to be 
affected by the spatial patterning of the landscape 
(Angelstam 1990). Unfortunately, gridded data are 
poorly suited to extracting configuration variables 
such as patch size (Tobalske and Tobalske 1999), so I 
could not test whether the inclusion of spatial variables
in the models would have improved classification
results.

The choice of the grid cell size for breeding-bird at- 
lases (and other distribution atlases) is a compromise 
between the level of detail sought and the human re- 
sources available to conduct censuses. A 100-hectare 
cell size was retained for both areas, but for larger
sites, even this coarse level of sampling may not be
possible (Jovéniaux 1993). In the Orbe Valley, increasing
the cell size to 400 hectares more than doubled classification
success (K), possibly by clarifying bird-habitat
association patterns. Heikkinen (1998) suggested that 
distribution patterns of rare plant species richness in a 
Finnish reserve may have been more obscured at the 
1-kilometer grid scale he used for his models than at 
either finer or broader scales. The number of 400- 
hectare absence cells in Geneva Canton was too small 
(Fig. 15.3) to allow models to be developed, so it was 
not possible to assess whether the classification im- 
provement observed in the Orbe Valley was site- 
specific or a more general pattern for the green wood- 
pecker. This illustrates the influence of scale in data- 
collection procedures: using 400-hectare cells, virtu- 
ally all of Geneva Canton appeared suitable for the 
nesting green woodpecker, but this was not the case 
with 100-hectare cells (Fig. 15.3). The proportion of 
cells in which the green woodpecker was predicted to 
be present also increased with increasing cell size in 
the Orbe Valley, where the ratio of presences over ab- 
sences reversed from 0.22 at the 100-hectare scale to 
1.25 at the 400-hectare scale (at the 0.0625-hectare 
MMU). The loss of information resulting from aggre- 
gating distribution squares could have been lessened 
by using an index of abundance, in other words the 
number of 100-hectare cells in each 400-hectare cell in 
which the species was recorded, as input to the LR 
procedure (Gates et al. 1994). 

It is also important to note that, despite being a bet- 
ter measure of accuracy than percent presence and ab- 
sence correctly classified, Kappa also has limitations; 
in particular, it is likely to be sensitive to sample size 
(Fielding and Bell 1997; Fielding, Chapter 21). Aggre- 
gating the 100-hectare cells into 400-hectare cells led 
to a drop in sample size (from 273 to 54 cells at the 
0.0625-hectare MMU and from 273 to 59 cells at the 
1-hectare MMU) and an increase in K values. How- 
ever, if sample size exerted a strong influence on 
Kappa, opposite results (higher K values for the 100- 
hectare cell models) would have been expected, be- 
cause commission error rates have been shown to de- 
cline with increasing number of observations (Karl et 
al., Chapter 51). This was not the case in this study:
commission error rates were similar, and omission
rates decreased, between 100-hectare cell models and 
400-hectare cell models. 

In general, models developed from atlas (gridded) 
data may be more prone to omission errors than other 
types of models (e.g., those developed from point 
data) because the position of the grid cells has no rela- 
tion to the spatial distribution of land-cover types. If a 
species is present in a cell dominated by unsuitable 
habitat, as would be the case if the cell happened to 
encompass only the edge of a highly suitable habitat 
patch, then the model will fail to predict a presence. 
This was observed for bird-habitat models in north- 
west Scotland, where grid cells of coastal nests were 
dominated by sea (Fielding and Haworth 1995), and 
for black woodpecker (Dryocopus martius) habitat 
models in the Jura, France, for cells bordering large 
forested patches but composed mostly of open habitat 
(Tobalske and Tobalske 1999). 

Finally, validation should be an integral part of 
model development (Morrison et al. 1992), and when- 
ever possible it should be conducted using an inde- 
pendent data set (Fielding and Bell 1997; Fielding, 
Chapter 21). I did conduct such an external validation 
by applying the models developed in the Orbe Valley 
to data in Geneva Canton and vice versa. In general, 
the models performed poorly when applied to the 
other area. In only one instance (100-hectare cells, 
0.0625-hectare MMU model from the Orbe Valley ap- 
plied to Geneva Canton) was improvement over 
chance classification statistically significant (P < 0.05). 
Increasing cell size or MMU did not improve predic- 
tive accuracy, and models developed with data from 
Geneva Canton performed poorly when applied to the 
Orbe Valley (see Tobalske 1998 for full results). Simi- 
larly, models developed for woodpecker species in the 
adjacent French Jura using a local atlas failed to cor- 
rectly predict species distribution in the Orbe Valley 
and Geneva Canton (Tobalske 1998). Differences be- 
tween the atlases (different grid cell size, surveys con- 
ducted by different people, etc.) probably contributed 
to these results. 

Although limited to one species, two study sites, 
two MMUs, and two atlas cell sizes, the present study 
demonstrates the overwhelming influence of scale on 
model composition and predictive accuracy, and the 
site-specificity of this influence. In one site, classification
results were little affected by increasing the MMU
of the land-cover data but were sensitive to the grain 
of the species distribution data. In the other, classifica- 
tion success was higher at the finer MMU than at the 
larger one. Extending the study to incorporate addi- 
tional species and scales may further elucidate the re- 
lationships between model predictive accuracy and 
scale, but the results are likely to be site- and species- 
specific. 

Before using atlas data for model development, sev- 
eral issues should be addressed, especially if manage- 
ment is the intended purpose of the models: (1) Is the 
scale of the atlas grid compatible with the goal of the 
study? Clearly, a large cell size (and few, if any, atlases 
use cell sizes smaller than 100 hectares) will not be ap- 
propriate if habitat management is to focus on indi- 
vidual nest sites. (2) Which independent variables 
should enter the model, at what scale, and what is 
their availability? Atlases typically cover relatively 
large areas; the development of remote-sensing tech- 
nology now provides land-cover data for such areas, 
but other habitat data may only be available patchily 
(e.g., snag density). (3) How reliable are the distribu- 
tion maps of the atlas? Atlas quality may vary, espe- 
cially if the area covered is large, and false-negatives 
(failure to report a species in a given cell) are likely to 
occur for rare, secretive, or highly mobile species 
(Johnson and Sargeant, Chapter 33). If reliability is 
questionable, data manipulation may be required be- 
fore proceeding with the modeling phase; Johnson and 
Sargeant (Chapter 33) present several methods for improving
the quality of atlas data. Conversely, models
developed from atlas data can also be used to improve 
the atlas itself: errors of commission may indicate cells 
in which a species did occur but went undetected (To- 
balske and Tobalske 1999). Finally, (4) modelers 
should consider not only the availability of data for 
model development, but also for external model vali- 
dation. The cost associated with collecting independ- 
ent data may preclude accuracy assessment over large 
areas. If data from another atlas are to be used for 
testing the accuracy of the models, it is essential to 
make sure that they are compatible with data used in 
model development (such as similar cell size and 
amount of effort applied during the census). 

Wildlife Habitat Modeling in an Adaptive
Framework: The Role of Alternative Models 


Michael J. Conroy and Clinton T. Moore 


Wildlife-habitat relationship models (e.g., Verner et al.
1986b; henceforward habitat models) purport to 
establish a quantitative relationship between measures 
of the physical and vegetation characteristics of a 
habitat (Morrison and Hall, Chapter 2), including 
vegetation composition, structure, and spatial 
arrangement of surrounding habitats, and the pres- 
ence or absence, abundance, or persistence of one or 
more species in a landscape (Morrison and Hall, 
Chapter 2). With the rapid development of geographic 
information systems (GIS) and associated computing 
algorithms, it is now possible to encode mathematical 
rules describing presumed habitat-population relation- 
ships and to rapidly perform complex analyses of the 
predicted impacts of various arrangements of land 
cover and vegetation characteristics. For instance, pre- 
sumed habitat-species occurrence relationships are a 
crucial part of gap analysis (Scott et al. 1993), as well 
as forest-planning tools such as FORPLAN (Johnson 
et al. 1980). 

In this chapter, we review some approaches used to 
evaluate the accuracy and predictive ability of habitat 
models. We suggest that standard model validation 
approaches are ambiguous and that assessment of the 
reliability of habitat models is most meaningful when 
models are a part of formal optimization procedures 
in which management actions are selected so as to 
achieve a specific, quantitative objective. Decision theoretical
methods allow for the incorporation of
sources of uncertainty in this process, one of which is 
model reliability. Finally, we think that most conserva- 
tion decisions are based on a relatively small number 
of assumptions about ecological pattern and process, 
and that formal consideration of models based on al- 
ternative assumptions is needed. Habitat models can 
be thought of as tools for translating alternative as- 
sumptions into testable predictions, and management 
can be thought of as the means of providing the ex- 
periment under which model predictions can be tested 
and models and decisions adaptively improved. 


Assessing Model Reliability 


It is not our intent to provide an exhaustive review ei- 
ther of model assessment in general or of habitat mod- 
els in particular. Nonetheless, several commonly 
agreed-upon principles will be relevant to the ensuing 
discussion. Model parameterization, verification, and 
calibration (Morrison and Hall, Chapter 2) are all 
critical parts of model development (Conroy et al. 
1995). In practice, each of these steps, even if taken, 
may be inadequate to assure a model’s reliability. For 
example, model verification demonstrates neither the 
truth nor the usefulness of a model, only the model’s 
internal consistency; thus, a verified model may 
nonetheless be inadequate for management if it is 
based upon faulty assumptions or logic. Likewise, 
statistical estimates of model parameters often cannot
be obtained, especially when key model states or pa- 
rameters simply cannot be observed, as is frequently 
the case for highly parameterized models (e.g., spa- 
tially explicit population models; Conroy et al. 1995; 
Dunning et al. 1995; Pulliam et al. 1992). One ap- 
proach is to use values based on general knowledge or 
assumptions about the animal’s life history, which are 
then adjusted so as to provide overall model agree- 
ment with observations. However, the resulting pa- 
rameter values are not bona fide statistical estimates, 
are likely not unique, and may not have biological 
meaning. A more serious concern is the likelihood that 
prediction beyond the range of data used to calibrate 
the model, frequently necessary in management, may 
prove unreliable. 

Regardless of the method of model calibration, 
there remains the issue of whether the model will in 
fact be useful for making management decisions (Van 
Horne, Chapter 4). Validation (Morrison and Hall, 
Chapter 2) probes beyond whether the model appears 
reasonable, and fits data; it also examines how well 
the model might perform under conditions different 
from those in which the model was constructed. How- 
ever, validation can be difficult in practice, for several 
reasons. First, statistical uncertainty in the data, im- 
precision in model predictions, or both may result in 
low power of statistical validation tests to discrimi- 
nate between observations and predictions (Mayer 
and Butler 1993). Thus, failure to reject the null hy- 
pothesis that the model and data agree is weak sup- 
port in favor of the model; in fact, it might simply be 
an artifact of insufficient sampling effort. Second, field 
measurements must be of appropriate resolution to 
validate a habitat model. Consider an artificial 
example of a species existing in a landscape containing 
three habitat types, each with different predicted val- 
ues (under a habitat model) for the species, depending 
on the habitat quality (Morrison and Hall, Chapter 
2): high (predicted number = 2/10 hectares), marginal 
(predicted number = 1/10 hectares), and low (pre- 
dicted number = 0/10 hectares) (Fig. 16.1a). Suppose 
we are capable of exactly enumerating the population 
in each 10-hectare block, and we observe one animal 
in each block. Under this scenario, we would have obtained
a poor correspondence between the model predictions
and observations, with the numbers agreeing
in only 38 percent (six of sixteen) of the comparisons.
We would probably conclude that this model was “in- 
valid,” or in other words was a poor representation of 
the relationship between habitat and abundance. Sup- 
pose instead that we were only capable of counting 
the total number of animals on each 40-hectare block 
(but could do so without error) and were incapable of 
assigning these counts to habitats other than the array 
of habitats occurring in each 40-hectare block (Fig. 
16.1b). Under this scenario, we would have 100- 
percent agreement between the model predictions and 
observations and might be inclined to consider this a 
valid model. 
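
The effect of aggregation in this example can be checked with a few lines of code. The sketch below is only illustrative: the 4 x 4 layout of predicted counts is a hypothetical arrangement (it is not taken from Figure 16.1), chosen so that every 40-hectare block sums to four animals, and the function names `agreement` and `aggregate` are our own.

```python
# Hypothetical predicted counts per 10-hectare cell
# (2 = high, 1 = marginal, 0 = low quality habitat).
# This layout is an assumption, not the one drawn in Figure 16.1.
predicted = [
    [1, 1, 2, 0],
    [1, 1, 1, 1],
    [2, 2, 2, 0],
    [0, 0, 0, 2],
]
# One animal observed in every 10-hectare cell.
observed = [[1] * 4 for _ in range(4)]


def agreement(pred, obs):
    """Fraction of cells in which predicted and observed counts are equal."""
    pairs = [(p, o) for prow, orow in zip(pred, obs) for p, o in zip(prow, orow)]
    return sum(p == o for p, o in pairs) / len(pairs)


def aggregate(grid, block=2):
    """Sum counts over non-overlapping block x block groups of cells."""
    n = len(grid)
    return [
        [
            sum(grid[r + dr][c + dc] for dr in range(block) for dc in range(block))
            for c in range(0, n, block)
        ]
        for r in range(0, n, block)
    ]


fine_scale = agreement(predicted, observed)
coarse_scale = agreement(aggregate(predicted), aggregate(observed))
print(fine_scale, coarse_scale)
```

With this layout the predictions match the observations in 6 of 16 cells at the 10-hectare scale and in 4 of 4 blocks at the 40-hectare scale, so the same model looks poor or perfect depending only on the resolution of the counts.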

The above artificial example, while highly con- 
trived, illustrates the point that the selection of the 
spatial scale is a subjective matter (Trani, Chapter 11) 
but one that may strongly influence the outcome of 
model validation (Laymon and Reid 1986; Elith and 
Burgman, Chapter 24). Reliance upon presence/ 
absence statistics in lieu of counts is another form of 
coarsening of the data that may result in apparent 
concurrence between model and data when finer reso- 
lution of the latter may have resulted in model rejec- 
tion. At an even finer resolution, predictions based on 
abundance or density alone will be inadequate for val- 
idating source-sink or other models in which habitat 
quality cannot be inferred from density, regardless of 
spatial scale (Pulliam 1988; Van Horne 1983; Conroy 
and Noon 1996; Maurer, Chapter 9). 

Sensitivity analysis (Morrison and Hall, Chapter 2) 
is often advocated as a practical alternative to true 
validation. However, these arguments are frequently 
not convincing, particularly if a model is to be used in 
decision making. Rather than assuring decision mak- 
ers of the robustness of the model and of GIS, insensi- 
tivity to input errors should be a warning that the 
model also may be insensitive with respect to making 
predictions. For some applications, it may be sufficient 
that the model is capable of ordering alternative man- 
agement actions, with respect to their relative impact 
on the resource objective (Hamilton and Moller 
1995). However, we are less sanguine than Hamilton
and Moller (1995) about even this utility for models
and instead propose that unverified assumptions and 
unreliable parameter values may render as unreliable 
even ordinal statements about the relative impacts of
various management alternatives; an example of how
this might occur follows.


Figure 16.1. Hypothetical comparison between predictions of a simple landscape model and observations under a null model of no
habitat affinity; cell values represent predicted and observed counts (a) by 10-hectare block, (b) by 40-hectare block.


The Role of Alternative Models 


As seen in the previous section, validation of habitat 
models presents serious methodological challenges. 
However, validation alone cannot resolve whether the 
model under consideration is superior to a plausible al- 
ternative model, in particular, a model that may imply 
a different course of management. We address this 
issue in a more appropriate decision-making context. 
Assume that we have a model (M) and an alternative
model (M’), and that both models are plausible, that
is, at least some theoretical or empirical support exists 
for each (e.g., Pascual et al. 1997). It may be that we 
have performed model validation tests and both mod- 
els are valid (i.e., neither model is rejected in compari- 
son to the data available). A natural question for a de- 
cision maker is: what difference will it make to my 
decision if I place full faith in model M versus if I place 
full faith in model M’? 

A simple illustration can be used to make this point 
by returning to the artificial example in Figure 16.1. 
Suppose that M corresponds to the model predicting 
that the species has specific affinities, as predicted by 
the map in Figure 16.1a (“Predicted”) and that model 
M' corresponds to the situation in Figure 16.1a (“Observed”)
in which the species is distributed perfectly
evenly among the sixteen 10-hectare blocks. Clearly, 
belief in one or the other of these models will make a 
difference in how habitats should be managed. Under 
the M scenario, management presumably would be di- 
rected toward certain habitats—in other words those 
that are preferred by the species, assuming a goal of 
conserving this species. Under the M' scenario, man- 
agement favoring these habitats would appear to be 
unwarranted, particularly given that such manage- 
ment would no doubt have inherent costs (e.g., trade- 
offs with other objectives). Clearly, from the stand- 
point of decision making, injudicious choice of the 
spatial scale for model prediction might potentially re- 
sult in a critical loss of information. 

Note that, depending on the scale at which obser- 
vations are made, the models are both valid and 
therefore equally plausible (Fig. 16.1b), or one model 
appears to have more empirical support than the 
other (Fig. 16.1a). Thus, the usual approach for 
validation—of comparing predictions with independ- 
ent observations—may be indeterminate, depending 
on the scale chosen. Also, note that sensitivity analy- 
sis contributes little to the resolution of model uncer- 
tainty. The fact that either model, or both, depending 
on the spatial scale, is relatively more or less sensi- 
tive to changes in parameter values sheds no light 
upon the question of which (if either) model will bet- 
ter inform decision making. 


Decision Making under Uncertainty 


If habitat or other models are to be useful to man- 
agers, they must be capable of making predictions 
about the consequences of management decisions that 
are better than the educated guesses that managers 
would make on their own in the absence of models. 
The fact that mathematical algorithms can join to- 
gether hundreds or thousands of habitat models and 
rapidly display the results using GIS should be small 
comfort if critical model components are poorly sub- 
stantiated by evidence (Van Horne, Chapter 4). Even 
in those cases where models seem to do a reasonable 
job of prediction, our earlier discussion should con- 
vince readers of the risks of blind application of valid 
models to decision-making problems.


On the other hand, we recognize that decisions 
must be made and that imperfect models, validated at 
inappropriate scales of resolution, or perhaps not at 
all, may be all that are available. Even under ideal cir- 
cumstances, assumptions about biological mechanisms 
will not be perfectly understood, and thus it will not 
be possible to make unambiguous predictions about 
the impacts of management decisions whether or not 
models are used to make these predictions. Obviously, 
biological systems, even if well understood, are subject 
to intrinsic variability, but of special concern here is 
what we term structural uncertainty. That is, more 
than one mechanism (or model) might plausibly ex- 
plain and predict the potential response of the system 
to management, and we are uncertain as to which is 
better for a given management goal. Formally, we are
faced with making a decision or action, a, from a set
of possible or feasible decisions $a \in \{a_1, a_2, \ldots, a_A\}$.
Any decision we make will result in an outcome that
will have a value to us, which we will denote as u(a).
This value may be in terms of species conservation,
economic gain, or perhaps a tradeoff between two or
more goals (e.g., species conservation versus economic
gain). Assuming that such an objective value can be
ascertained or agreed upon, a rational decision maker 
(Lindley 1985) will seek to select that decision that 
will result in the greatest value for the objective. How- 
ever, uncertainty exists as to what the actual objective 
value or utility will be for any decision. First, consider 
uncertainty induced by environmental or demographic 
variability. Let $E[u(a \mid \theta)]$ represent the average or expected
value or utility of decision a, assuming that a
particular model or parameter value (represented by
$\theta$) is known to be true. This value is obtained by averaging
over the statistical distribution of uncertain outcomes
x resulting from each possible decision a


$$E[u(a \mid \theta)] = \int_x u(a \mid x; \theta)\, f(x \mid \theta)\, dx \quad \text{for } a \in \{a_1, a_2, \ldots, a_A\},$$


where $u(a \mid x; \theta)$ is the value of decision a given outcome
x under model $\theta$ and $f(x \mid \theta)$ is a statistical distribution
of these outcomes, where $\theta$ is assumed known.

Until this point, we have assumed that the model 
(as expressed by $f(x \mid \theta)$) is correctly specified, and that
any deviations of model predictions from outcomes 
must be due to environmental or demographic factors. 
Here we switch our focus to structural (model) uncertainty.
Let $p(\theta_i)$ represent a probability distribution reflecting
our uncertainty in $\theta$, which takes on values
$\theta_i$, $i = 1, \ldots, m$, under each of m alternative models.
This uncertainty may include statistical error, but
more generally includes bias due to incorrect model
assumptions. The average value of the decision is now


$$E[u(a)] = \sum_{i=1}^{m} \int_x u(a \mid x; \theta_i)\, p(\theta_i)\, f(x \mid \theta_i)\, dx, \qquad (16.1)$$


where $\sum_{i=1}^{m} p(\theta_i) = 1$; when $\theta$ is continuous, the summation
operator changes to integration over $\theta$. By definition,
the optimal decision satisfies


$$\max_a E[u(a)]$$


and must be found by averaging over the uncertain environmental
and demographic conditions (i.e., values
of x) and structural uncertainty (i.e., values of $\theta$). Ignoring
either source of uncertainty will result in suboptimal
decision making. Conversely, reduction of either
source of uncertainty will improve decision
making. Obviously, there is little that can be done 
about environmental and demographic uncertainty, 
beyond including components of each in the decision- 
making model. 

On the other hand, structural uncertainty can be re- 
duced, theoretically to zero, if additional information 
(data) can be obtained that places higher probability 
on certain model structures (values of $\theta$) than on others.
We describe below how this source of uncertainty
can be reduced via adaptive management. For now, 
we focus on the impact of structural uncertainty on 
decision making. Consider a case where an optimal 
decision a is sought, and consider only the average re- 
sponse across environmental and demographic condi- 
tions, assuming that a given model of ecological 
processes is true. Suppose that there are two alternative
models of this process, which we shall label $\theta_1$
and $\theta_2$, and that our degree of belief in each model is
$p(\theta_1)$ and $p(\theta_2) = 1 - p(\theta_1)$, respectively. The expected
value of any candidate decision a, taking into account
only structural uncertainty, is


$$E[u(a)] = u(a \mid \theta_1)\,p(\theta_1) + u(a \mid \theta_2)\,p(\theta_2).$$


Clearly, structural uncertainty exists any time that
$0 < p(\theta_1) < 1$. However, notice that this uncertainty is
only important in the decision-making process to the
extent that the values of the resulting decisions would
be different, or in other words


$$u(a \mid \theta_1) \neq u(a \mid \theta_2).$$


Conversely, if the models predict the same outcome 
for any given decision, or if that outcome is equally 
valued to the decision maker, then uncertainty about 
the ecological process is not relevant to the decision 
process. This can be illustrated by a simple numerical 
example. Suppose that for each of the above two 
model structures we obtain values for decision a of 
$u(a \mid \theta_1) = 4$ and $u(a \mid \theta_2) = 7$. Suppose that there is a
competing decision, $a'$, for which the corresponding
values are $u(a' \mid \theta_1) = 6$ and $u(a' \mid \theta_2) = 4$. If there is complete
uncertainty about which model correctly describes
the process, then $p(\theta_1) = p(\theta_2) = 0.5$ and the
values for each decision are given by 


E[u(a)] = 4(0.5) + 7(0.5) = 5.5 
and 
E[u(a’)] = 6(0.5) + 4(0.5) = 5. 


Therefore, the optimal decision is a. Suppose how- 
ever that additional knowledge accumulates (e.g., 
from a monitoring program carried out on the man- 
aged system) so as to place more faith in model 1, 
such that $p(\theta_1) = 0.8$. Now the decision values are


E[u(a)] = 4(0.8) + 7(0.2) = 4.6 
and 
E[u(a’)] = 6(0.8) + 4(0.2) = 5.6 


and the optimal decision is now a’. This approach 
thus places the issue of model reliability (and its reso- 
lution) squarely in the context of optimal decision 
making. That is, we are no longer comparing a single 
model to an arbitrary measure of accuracy but instead 
are asking which decision should we make given two 
or more plausible models and an assessment of rela- 
tive belief in each model. In some cases (e.g., where 
theory or data provide justification), we may be justi- 
fied in giving one model more weight; in others (e.g., 
either model is theoretically justifiable, and both seem 
valid given current data), we may not. In either instance,
we have an objective means for making a decision,
taking into account model uncertainty.
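
The numerical example above can be reproduced in a few lines. This is a minimal sketch, not a general decision-analysis tool: the utilities and model weights are those given in the text, while the names `expected_utility`, `best_decision`, and `a_prime` are our own labels.

```python
def expected_utility(utilities, weights):
    """Average a decision's value over the alternative models."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("model probabilities must sum to 1")
    return sum(u * w for u, w in zip(utilities, weights))


def best_decision(utility_table, weights):
    """Return the decision with the largest expected utility, plus all scores."""
    scores = {d: expected_utility(u, weights) for d, u in utility_table.items()}
    return max(scores, key=scores.get), scores


# Values of each decision under model 1 and model 2, as in the text.
utility_table = {"a": (4, 7), "a_prime": (6, 4)}

choice_equal, scores_equal = best_decision(utility_table, (0.5, 0.5))
choice_updated, scores_updated = best_decision(utility_table, (0.8, 0.2))
print(choice_equal, scores_equal)
print(choice_updated, scores_updated)
```

Setting the two expected values equal, 4p + 7(1 - p) = 6p + 4(1 - p), gives p = 0.6: decision a is preferred while belief in model 1 is below 0.6, and a' is preferred once it exceeds 0.6.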


Adaptive Management 


As shown above, model uncertainty must be consid- 
ered, along with other sources of uncertainty, in mak- 
ing optimal conservation decisions. Because our 
knowledge of systems will always be imperfect, and 
parameters will always be estimated with error, model 
uncertainty can never be eliminated. However, model 
uncertainty can and should be reduced. One method 
to reduce model uncertainty is adaptive optimization, 
as incorporated as a part of adaptive resource man- 
agement (Walters 1986). The basic steps of adaptive 
optimization are 


1. Define a resource objective (e.g., species conserva- 
tion, as above) 

2. Delineate a set of feasible management alternatives
$a \in \{a_1, a_2, \ldots, a_A\}$

3. Develop models $\theta \in \{\theta_1, \theta_2, \theta_3, \ldots, \theta_m\}$ that predict
the impact of the decision on the objective

4. Identify and quantify the relevant sources of uncer- 
tainty in (3) 

5. Implement the decision that appears to be optimal 
given (4) 

6. Compare predictions under each model to data (x) 
collected following management 

7. Compute a likelihood $L(x \mid \theta_i)$ for each model given
these data; these likelihoods reflect the relative
agreement of the observed data to the predictions
of each model

8. Update the model probabilities from Bayes’ Theorem


$$p(\theta_i \mid x) = \frac{p(\theta_i)\,L(x \mid \theta_i)}{\sum_{j=1}^{m} p(\theta_j)\,L(x \mid \theta_j)}, \quad \theta_i \in \{\theta_1, \theta_2, \ldots, \theta_m\}, \qquad (16.2)$$


where $p(\theta_i \mid x)$ is the posterior probability of $\theta_i$,
conditioned on having observed the data x

9. Incorporate these new model weights in prediction
and decision making at the next decision opportunity
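
The updating step of the list above (steps 7 to 9) can be sketched as follows. The function name `update_model_weights` is ours, and the likelihood values are hypothetical numbers chosen only for illustration; in practice they would come from comparing each model's predictions with monitoring data.

```python
def update_model_weights(priors, likelihoods):
    """Apply Bayes' theorem to a discrete set of alternative models."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    if total <= 0:
        raise ValueError("at least one model must have positive prior and likelihood")
    return [j / total for j in joint]


# Equal prior belief in two models.
priors = [0.5, 0.5]
# Hypothetical likelihoods of the monitoring data under each model.
likelihoods = [0.4, 0.1]

posterior = update_model_weights(priors, likelihoods)
print(posterior)
```

With these assumed likelihoods the weight on the first model rises from 0.5 to 0.8, and the posterior weights would serve as the priors at the next decision opportunity.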


Thus, adaptive resource management provides a
mechanism for feedback of information following
management, which in turn reduces model uncertainty
and promotes further understanding of system 
processes. Because of the long-term nature of many 
conservation problems, that feedback may be slow or 
may not occur at all at a given location (e.g., once a 
reserve is built, there will likely be little interest in re- 
visiting the decision). However, knowledge gained 
through monitoring one system should inform future 
decision making in similar systems. In other conserva- 
tion problems, for instance those involving forest cut- 
ting practices, decisions may be regularly revisited and 
the information gained from one decision cycle will 
provide direct feedback for future decision cycles. 


Case Study: Habitat Management for 
Population Persistence under Uncertainty 


We illustrate the above principles with an example of 
landscape management in which the objective is the 
maintenance of populations of two forest species. The 
two species have resource needs that pose a potential 
conflict for management in the sense that provision of 
resources for one species may remove resources for the 
other. This example, although hypothetical, is similar to 
a problem we are currently investigating involving for- 
est management in the Piedmont National Wildlife 
Refuge (PNWR) in central Georgia. At PNWR, a pri- 
mary management emphasis is that of maintaining vi- 
able populations of the endangered red-cockaded 
woodpecker (Picoides borealis, henceforth woodpeck- 
ers), with the long-term goal of tripling the 1998 refuge 
population (Richardson et al. 1998). However, concern 
exists that aggressive management favoring woodpeck- 
ers, including maintenance of low densities of under- 
story and midstory vegetation via prescribed burning, 
may adversely affect species of birds and other organ- 
isms that depend on these vegetative strata for shelter, 
foraging, or nesting. Previous research (Powell et al. 
2000) has addressed the specific concern that wood- 
pecker habitat management reduces fitness of the wood 
thrush (Hylocichla mustelina, henceforth thrushes) as 
measured by adult and juvenile survival and reproduc- 
tion rates. Results to date suggest that woodpecker 
management, at least as currently practiced at PNWR, 
has minimal if any impact on thrush fitness. How- 
ever, this system exhibits great spatial and temporal 
variability in demographic parameters, which together
with estimation error induces uncertainty in these con- 
clusions. Further, there is no assurance that results ob- 
served by Powell et al. (2000) would extend to a more 
aggressive management regime than that which oc- 
curred during the study. Current understanding of the 
effects of woodpecker management (e.g., Richardson et 
al. 1998) may be inadequate to accurately predict 
whether such management would enhance the long- 
term viability of woodpeckers—the primary goal of its 
management. These factors, taken together, suggest that 
forest management at PNWR and similar systems, in 
addition to being influenced by system uncertainty and 
statistical error, may be relatively sensitive to structural 
assumptions in models used to predict the impact of 
management decisions on objective values that include 
both a woodpecker component and a component re- 
flecting other resource goals. 

We describe a simplified, artificial system that 
nonetheless captures some of the essential elements of 
management at PNWR. Here we reduce the resource 
management objective to a tradeoff between a species 
favored by understory vegetation reduction (repre- 
sented by woodpeckers) and another favored by its re- 
tention (represented by thrushes). However, the statements
that “woodpeckers are favored by burning”
and “thrushes are favored by exclusion of burning”
result from assumptions in the models underpinning 
the decision analysis. We thus formulated explicit al- 
ternatives to these assumptions to mimic extremes in 
the relationship between population response and 
management that might be consistent with real field 
data. Specifically, we considered alternatives that pro- 
pose that populations do not respond to management 
actions. We incorporated these different hypotheses in 
eight alternative models, as described below. 


System Features and General Assumptions 


The landscape was represented as a 10 × 10 square
grid. Any cell in the grid could be occupied by a 
woodpecker, a thrush, or by both species. From an ini- 
tial distribution of woodpeckers within the landscape, 
models predict a resulting distribution of woodpeckers 
following a single 10-year time step. These models al- 
ternatively suggest that woodpecker population 
growth is, or is not, dependent on distance to nearest-neighbor
source sites, and is, or is not, dependent on
woodpecker response to habitat management through 
controlled burning. In contrast, we model thrush oc- 
cupancy only in a habitat-suitability context and do 
not consider an initial distribution of thrush. That is, 
following management a thrush occurs in a cell with 
probability that does, or does not, depend on the 
burning status of that cell. Generally, we want to max- 
imize population growth of woodpeckers and density 
of thrushes through appropriate selection of one of a 
few decision alternatives. Our aim is to look at every 
alternative for each combination of models and for 
certain initial distributions of woodpeckers. 


Habitat-occupancy Models 


For the woodpecker, we modeled single-time-step cell 
occupation probabilities conditional on current cell 
occupation status, habitat treatment, and distance to 
nearest occupied cell. That is, we built expressions for 
the conditional probabilities 


$$\Pr\{X_i(1) = x_1 \mid X_i(0) = x_0,\ d_i(0) = d_0,\ H_i = h_i\},$$


where $X_i(t)$ is a random variable indicating occupation
status of landscape cell i at time t, $x_t = 0$ (unoccupied)
or 1 (occupied), $d_i(0)$ is the decision variable for cell i,
$d_0$ is the decision value (1 = burned, 0 = not burned),
$H_i$ is a random variable, and $h_i$ is a distance value.

Given that landscape cell i is currently occupied by 
a woodpecker (i.e., $X_i(0) = 1$), we used the following
expression as the model of cell occupancy probability
at time 1:


$$\Pr\{X_i(1) = 1 \mid X_i(0) = 1,\ d_i(0) = d_0\} = \begin{cases} p_0, & d_0 = 1 \\ p_0', & d_0 = 0 \end{cases} \qquad (16.3)$$


where $p_0$ and $p_0'$ are user-selected probabilities. Because
$x_1 = 0$ or 1,
$\Pr\{X_i(1) = 0 \mid X_i(0) = 1, d_i(0) = d_0\} = 1 - \Pr\{X_i(1) = 1 \mid X_i(0) = 1, d_i(0) = d_0\}$.
Thus, the probability of woodpecker persistence is sensitive to
the management decision, where the degree of sensitivity
is reflected in the difference $p_0 - p_0'$.
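
As a small illustration of the persistence model, the sketch below converts annual persistence rates into probabilities for a single ten-year time step. The annual rates of 0.99 (burned) and 0.96 (unburned) are the ones quoted with the simulation settings later in this chapter; the function names and the assumption that annual persistence events are independent are ours.

```python
def decadal_rate(annual_rate, years=10):
    """Probability of persisting through `years` independent annual steps."""
    return annual_rate ** years


def persistence_probability(burned, p0, p0_prime):
    """Persistence probability of an occupied cell given its burn status."""
    return p0 if burned else p0_prime


p0 = decadal_rate(0.99)        # burned cell
p0_prime = decadal_rate(0.96)  # unburned cell

# Ratio of annual extirpation risks, unburned versus burned.
risk_ratio = (1 - 0.96) / (1 - 0.99)
print(round(p0, 3), round(p0_prime, 3), round(risk_ratio, 2))
```

The difference `p0 - p0_prime` is the quantity that measures how sensitive persistence is to the burning decision.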

Given that landscape cell i is not currently occupied 
by a woodpecker, we expressed the probability of cell 
i being colonized at time 1 as a function of distance $h_i$
to the nearest occupied cell and burning status for the
cell:


$$\Pr\{X_i(1) = 1 \mid X_i(0) = 0,\ d_i(0) = d_0,\ H_i = h_i\} = \begin{cases} e^{-h_i/\beta}, & d_0 = 1 \\ e^{-h_i/(\alpha\beta)}, & d_0 = 0 \end{cases} \qquad (16.4)$$


where $\alpha$ and $\beta$ are user-controlled parameters. Thus,
woodpecker colonization probability is partially dependent
on the spatial distribution of woodpeckers.
We proposed an alternative model in which colonization
probability was not sensitive to $h_i$:


$$\Pr\{X_i(1) = 1 \mid X_i(0) = 0,\ d_i(0) = d_0\} = \begin{cases} \beta\left(1 - e^{-a/\beta}\right)/a, & d_0 = 1 \\ \alpha\beta\left(1 - e^{-a/(\alpha\beta)}\right)/a, & d_0 = 0 \end{cases} \qquad (16.5)$$


where a is a user-controlled parameter. We derived this
model by integrating the functions in equation 16.4
over the interval 0 to a and then dividing the result by
the length of the interval to obtain a uniform probability
mass over 0 to a.
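
The relationship between the spatial and non-spatial colonization models can be checked numerically. The sketch below assumes that equation 16.4 has the exponential-decay form exp(-h/beta) for burned cells and exp(-h/(alpha*beta)) for unburned cells, which is our reading of the equations in this section; the parameter values are hypothetical and the function names are our own.

```python
import math


def colonization_spatial(h, burned, alpha, beta):
    """Colonization probability at distance h from the nearest occupied cell."""
    scale = beta if burned else alpha * beta
    return math.exp(-h / scale)


def colonization_nonspatial(burned, alpha, beta, a):
    """Closed-form average of the spatial model over distances 0 to a."""
    scale = beta if burned else alpha * beta
    return scale * (1.0 - math.exp(-a / scale)) / a


def numeric_average(burned, alpha, beta, a, steps=20_000):
    """Midpoint-rule average of the spatial model over 0 to a."""
    width = a / steps
    total = sum(
        colonization_spatial((k + 0.5) * width, burned, alpha, beta)
        for k in range(steps)
    )
    return total * width / a


# Hypothetical parameter values, chosen only for illustration.
alpha, beta, a = 0.5, 2.0, 5.0
for burned in (True, False):
    print(burned,
          colonization_nonspatial(burned, alpha, beta, a),
          numeric_average(burned, alpha, beta, a))
```

Under these assumptions the closed form and the numerical average agree, which is what the derivation by integration over the interval 0 to a implies.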

For both the woodpecker persistence and the colo- 
nization models, we considered forms in which occu- 
pation probabilities were not dependent on the habi- 
tat decision. For the persistence model, we used the 
expression 


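$$\Pr\{X_i(1) = 1 \mid X_i(0) = 1\} = (p_0 + p_0')/2. \qquad (16.6)$$


For the spatially dependent colonization model, we
used the expression


$$\Pr\{X_i(1) = 1 \mid X_i(0) = 0,\ H_i = h_i\} = \left(e^{-h_i/\beta} + e^{-h_i/(\alpha\beta)}\right)/2, \qquad (16.7)$$

and for the non-spatially dependent model, we used


$$\Pr\{X_i(1) = 1 \mid X_i(0) = 0\} = \left[\beta\left(1 - e^{-a/\beta}\right) + \alpha\beta\left(1 - e^{-a/(\alpha\beta)}\right)\right]/(2a). \qquad (16.8)$$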


Combinations of these model structures provided 
four alternative models for the woodpecker response 
to management and woodpecker spatial distribution: 


1. model $W_{DS}$: decision-sensitive and spatially sensitive;
equations 16.3 and 16.4.

2. model $W_{D\cdot}$: decision-sensitive and spatially insensitive;
equations 16.3 and 16.5.

3. model $W_{\cdot S}$: decision-insensitive and spatially sensitive;
equations 16.6 and 16.7.

4. model $W_{\cdot\cdot}$: decision-insensitive and spatially
insensitive; equations 16.6 and 16.8.


Unlike the woodpecker occupation probabilities, the
thrush occupation probability of cell i at time 1 was
considered to be solely dependent on habitat treatment
in one model alternative ($T_D$)


$$\Pr\{Y_i(1) = 1 \mid d_i(0) = d_0\} = \begin{cases} q_0, & d_0 = 1 \\ q_0', & d_0 = 0 \end{cases} \qquad (16.9)$$


where $Y_i(1)$ is the thrush occupation status (either 0 or 1)
of cell i at time 1, and $q_0$ and $q_0'$ are probabilities set
by the user. An alternative model ($T_{\cdot}$) to reflect
decision-insensitivity for thrushes is


$$\Pr\{Y_i(1) = 1\} = (q_0 + q_0')/2.$$


Thus, the four woodpecker model alternatives in com- 
bination with each of the two thrush model alterna- 
tives yielded eight alternative system models. 
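
The eight system models are simply the cross product of the woodpecker and thrush alternatives, which can be enumerated directly. In this sketch the dictionary keys are our own labels, and the equal prior weights are an assumption added for illustration rather than a setting taken from the text.

```python
from itertools import product

# Each woodpecker model is a (decision_sensitive, spatially_sensitive) pair;
# each thrush model is a single decision_sensitive flag.
woodpecker_models = list(product((True, False), repeat=2))
thrush_models = [True, False]

system_models = [
    {
        "woodpecker_decision_sensitive": wd,
        "woodpecker_spatially_sensitive": ws,
        "thrush_decision_sensitive": td,
    }
    for (wd, ws), td in product(woodpecker_models, thrush_models)
]

# Assumed starting point: equal belief in every system model.
equal_priors = [1.0 / len(system_models)] * len(system_models)
print(len(system_models), equal_priors[0])
```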


Landscape Simulation 


We simulated effects of decisions under each of the 
species models over a range of initial woodpecker 
conditions. We considered four types of initial condi- 
tion: (1) low woodpecker occurrence (n = 5 cells oc- 
cupied), highly clumped; (2) low occurrence, highly 
dispersed; (3) high occurrence (n = 20), highly 
clumped; and (4) high occurrence, highly dispersed. 
We used a rejection procedure to generate clumped 
and dispersed distributions. We calculated an index 
of clumping K (Krishna Iyer 1949; Pielou 1977) for 
each randomly generated candidate distribution of 5 
occupied cells, and we assumed that the index fol- 
lowed a normal distribution under random mingling 
of cells. We accepted the distribution as a clumped 
distribution if K > 1.282 (normal critical value at 
90th percentile) and as a dispersed distribution if K < 
-1.282. We continued this process until we had generated
one hundred distributions of each type on the
landscape grid. 
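
The rejection procedure can be sketched as follows. This is only an illustration under stated assumptions: the landscape is taken to be a 10 x 10 grid, and a standardized mean nearest-neighbor distance (with moments estimated by Monte Carlo) stands in for the Krishna Iyer index K, whose exact form is not given here. The function names are hypothetical.

```python
import math
import random
import statistics

SIDE = 10  # assumption: the 100-cell landscape is a 10 x 10 grid


def random_cells(n, rng):
    """Place n occupied cells uniformly at random on the grid."""
    return [divmod(k, SIDE) for k in rng.sample(range(SIDE * SIDE), n)]


def mean_nn_distance(cells):
    """Mean distance from each occupied cell to its nearest occupied neighbor."""
    return sum(
        min(math.dist(a, b) for b in cells if b != a) for a in cells
    ) / len(cells)


def null_moments(n, rng, draws=2000):
    """Monte Carlo mean and SD of the statistic under random placement."""
    values = [mean_nn_distance(random_cells(n, rng)) for _ in range(draws)]
    return statistics.mean(values), statistics.stdev(values)


def generate(n, kind, count, rng, z_crit=1.282):
    """Keep drawing random candidates until `count` of the requested kind pass."""
    mu, sd = null_moments(n, rng)
    accepted = []
    while len(accepted) < count:
        cells = random_cells(n, rng)
        # Stand-in index (not Krishna Iyer's K): positive when cells sit
        # closer together than expected under random placement.
        k = (mu - mean_nn_distance(cells)) / sd
        if (kind == "clumped" and k > z_crit) or (
            kind == "dispersed" and k < -z_crit
        ):
            accepted.append(cells)
    return accepted
```

Calling `generate(5, "clumped", 100, random.Random(1))` would produce one hundred clumped starting distributions of five occupied cells; the same call with `"dispersed"` gives the contrasting set.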

For each initial distribution of woodpecker occupancy,
we simulated a set of management decisions
under each of the alternative models. The burning status
for cell i, d_i(0), was a random outcome of the decision
variables d^(1), the proportion of woodpecker-vacant
habitat burned, and d^(2), the proportion of
woodpecker-occupied habitat burned. For a fixed
selection of d^(1) and d^(2), (100 - n)d^(1) cells were randomly
chosen for burning from the set of woodpecker-vacant
cells, and nd^(2) cells were chosen at random
from the set of woodpecker-occupied cells. We considered
four settings of d^(1) and d^(2):

1. {d^(1), d^(2)} = {0.2, 0.2}
2. {d^(1), d^(2)} = {0.2, 0.8}
3. {d^(1), d^(2)} = {0.8, 0.2}
4. {d^(1), d^(2)} = {0.8, 0.8}


Given an initial distribution of woodpeckers and
values of d^(1) and d^(2), we drew one hundred random
arrangements of the d_i(0). Thus, each of the sixteen
combinations of initial conditions and decision variables
provided ten thousand random distributions of
woodpecker occupancy and burning activity.
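
A minimal sketch of how one random burning arrangement might be drawn, assuming cells are indexed 0 to 99 and that the cell counts (100 - n)d^(1) and nd^(2) are rounded to whole cells. The function name and the cell indexing are hypothetical.

```python
import random


def burn_assignment(occupied, n_cells, d1, d2, rng):
    """Return a 0/1 burn status d_i(0) for every cell.

    occupied: indices of woodpecker-occupied cells
    d1: proportion of woodpecker-vacant cells to burn
    d2: proportion of woodpecker-occupied cells to burn
    """
    occupied = sorted(set(occupied))
    occ_set = set(occupied)
    vacant = [c for c in range(n_cells) if c not in occ_set]
    # Whole numbers of cells to burn in each class (rounding is an assumption).
    n_vacant_burned = round(len(vacant) * d1)
    n_occupied_burned = round(len(occupied) * d2)
    burned = set(rng.sample(vacant, n_vacant_burned))
    burned |= set(rng.sample(occupied, n_occupied_burned))
    return [1 if c in burned else 0 for c in range(n_cells)]
```

With five occupied cells and the setting {0.2, 0.8}, this burns 19 of the 95 vacant cells and 4 of the 5 occupied cells.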

All simulations were conducted over a single ten-year
time step. Values 0.904 and 0.665 for p_0 and p_0',
respectively, correspond to annual persistence rates of
0.99 and 0.96; in other words, annual risk of extirpation
is four times as high for an unburned cell as
for a burned cell. We chose values of 0.8 and 0.25 for
β and α, respectively, which render colonization unlikely
in any burned cell not adjacent to an occupied
cell. For unburned cells, colonization is extremely unlikely
for any nearest-neighbor distance. We chose values
of 0.1 and 0.6 for the thrush occupation probabilities
q_0 and q_0', respectively.
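
The arithmetic behind these values can be checked directly. The short sketch below compounds the annual persistence rates over ten years and also evaluates the decision-insensitive averages used in the persistence and thrush models; the variable names are illustrative only.

```python
# Ten-year persistence from annual rates (burned: 0.99, unburned: 0.96).
p_burned = 0.99 ** 10      # about 0.904
p_unburned = 0.96 ** 10    # about 0.665

# Annual extirpation risk, unburned relative to burned.
risk_ratio = (1 - 0.96) / (1 - 0.99)   # about 4

# Decision-insensitive alternatives average the two treatment values.
p_insensitive = (0.904 + 0.665) / 2    # woodpecker persistence, 0.7845
q_insensitive = (0.1 + 0.6) / 2        # thrush occupation, 0.35

print(round(p_burned, 3), round(p_unburned, 3), round(risk_ratio, 1))
```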

For each of the decision simulations, we recorded
the woodpecker population growth as λ = Σ X_i(1)/n
and we calculated w = Σ Y_i(1)/100, the proportion of
habitat occupied by thrushes. We combined these
quantities in the objective function


J = \{\max(0, \lambda - 1)\}^{u} \, w^{v},


where u and v were set to the values 1.0 and 0.2, respectively.
These values imply that woodpecker population
growth is rewarded approximately linearly as
long as thrushes occupy a minimum threshold (about
20 percent) of the landscape. Rewards are minimal if
the decision grows one species at the expense of the
other. We obtained means and variances of ten thousand
objective function evaluations for each of 128
initial condition × model alternative × decision alternative
combinations.
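
As an illustration, the objective function can be written as a small function of the simulated occupancy vectors. This is a sketch; the function and argument names are hypothetical, and the thrush proportion is taken over all cells supplied.

```python
def objective(x1, y1, n, u=1.0, v=0.2):
    """Objective J for one simulated outcome.

    x1: woodpecker occupancy (0/1) per cell at time 1
    y1: thrush occupancy (0/1) per cell at time 1
    n:  initial number of woodpecker-occupied cells
    """
    growth = sum(x1) / n                 # woodpecker population growth
    thrush_share = sum(y1) / len(y1)     # proportion of cells with thrushes
    return max(0.0, growth - 1.0) ** u * thrush_share ** v
```

For example, starting from n = 5, an outcome with 8 occupied woodpecker cells and thrushes in 60 of 100 cells gives J = 0.6 × 0.6^0.2, whereas any outcome in which woodpeckers decline gives J = 0.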


Simulation Results 


Because initial conditions, the decision action, and 
population responses were all realizations of stochas- 
tic processes, values of the objective function were 
also stochastic. Therefore, for any given population 
model, each decision was superior to the others in
at least one simulation simply by chance (Tables 16.1,
16.2). However, the large number of simulations 
clearly indicated that certain decisions provided the 
greatest expected value of the objective function and 
that others were consistently inferior. 

The optimal decision depended on accurate identifi- 
cation of the underlying management response model 
(Tables 16.1, 16.2; Fig. 16.2). For example, given that 
initial woodpecker population size is 20, then the deci- 
sion to burn 20 percent of both woodpecker-vacant 
and woodpecker-occupied landscape cells is the best 
decision only if one correctly presumes that thrushes 
respond negatively to fire and that woodpeckers do 
not respond at all (Fig. 16.2d-f). However, this same 
decision is the worst that can be made if, in fact, 
woodpeckers respond positively to fire (Fig. 16.2d-f). 
The four decisions were equally adequate only in the 
special case in which neither species responded to fire 
management. 

The parameter values that we chose for the objec- 
tive function heavily rewarded management directed 
toward woodpeckers, and this was reflected in how 
the decision patterns varied among management response
models. For the four model types in which
woodpeckers were not assumed to respond to fire
management (models W..T., W..TD, W.ST., W.STD),
models W..T. and W.ST. provided no trend in mean
objective value as extent of burning increased in the
landscape, whereas the thrush response models
(W..TD and W.STD) provided a negative trend as
more of the landscape was burned (Fig. 16.2). However,
all woodpecker response models provided a
positive trend in mean objective value, though the
rate of increase was slower when the thrush response
was considered (Fig. 16.2, models WD.TD and
WDSTD) than when it was not (Fig. 16.2, models
WD.T. and WDST.). As objective value parameters
are altered to bring management desires for the two 
species into greater conflict, we would expect the 
trend in objective value over the decisions under the 


TABLE 16.1.


Mean and approximate 99% confidence interval for objective value (J) and frequency of optimality (n_opt) for four decisions under
eight alternative system models and two types of spatial arrangements of woodpeckers (dispersed versus clumped), given an
initial population of five woodpeckers.


                                                  Decision(a)
         d(1) = 0.2, d(2) = 0.2         d(1) = 0.2, d(2) = 0.8         d(1) = 0.8, d(2) = 0.2         d(1) = 0.8, d(2) = 0.8
Model(b) J      99% CI         n_opt    J      99% CI         n_opt    J      99% CI         n_opt    J      99% CI         n_opt

Initial woodpecker population size = 5, highly dispersed

W..T.    0.384  (0.376-0.392)   2440    0.385  (0.378-0.393)   2400    0.386  (0.378-0.394)   2537    0.388  (0.381-0.396)   2469
W..TD    0.412  (0.404-0.420)   2941    0.414  (0.406-0.422)   2868    0.347  (0.340-0.354)   2095    0.345  (0.338-0.352)   2005
WD.T.    0.175  (0.169-0.180)    438    0.254  (0.248-0.260)    809    0.521  (0.512-0.530)   3518    0.631  (0.622-0.640)   5124
WD.TD    0.187  (0.182-0.193)    713    0.267  (0.261-0.274)   1154    0.471  (0.462-0.479)   3395    0.559  (0.551-0.567)   4664
W.ST.    0.632  (0.622-0.641)   2529    0.626  (0.616-0.635)   2423    0.631  (0.621-0.640)   2436    0.633  (0.623-0.642)   2501
W.STD    0.674  (0.664-0.684)   3059    0.674  (0.664-0.684)   3056    0.576  (0.567-0.584)   1994    0.557  (0.548-0.565)   1825
WDST.    0.161  (0.156-0.167)     25    0.237  (0.231-0.242)     45    1.045  (1.033-1.056)   4254    1.158  (1.147-1.169)   5622
WDSTD    0.174  (0.168-0.179)     59    0.256  (0.250-0.262)    107    0.939  (0.929-0.950)   4269    1.031  (1.020-1.041)   5488

Initial woodpecker population size = 5, highly clumped

W..T.    0.385  (0.377-0.393)   2447    0.386  (0.378-0.393)   2483    0.388  (0.380-0.396)   2441    0.388  (0.380-0.396)   2487
W..TD    0.414  (0.406-0.423)   2972    0.409  (0.401-0.418)   2881    0.346  (0.339-0.353)   2060    0.341  (0.335-0.348)   2013
WD.T.    [illegible]  (0.173-0.184)    512    0.252  (0.246-0.258)    812    0.515  (0.506-0.524)   3500    0.625  (0.616-0.634)   5050
WD.TD    0.189  (0.183-0.195)    728    0.270  (0.263-0.276)   1141    0.471  (0.462-0.479)   3392    0.559  (0.551-0.567)   4670
W.ST.    0.474  (0.465-0.482)   2468    0.475  (0.466-0.483)   2447    0.475  (0.466-0.483)   2467    0.476  (0.468-0.485)   2494
W.STD    0.506  (0.497-0.515)   2971    0.505  (0.496-0.514)   2934    0.433  (0.426-0.441)   2046    0.428  (0.420-0.436)   1977
WDST.    0.115  (0.111-0.119)     56    0.182  (0.177-0.187)    107    0.784  (0.773-0.794)   4000    0.903  (0.892-0.913)   5761
WDSTD    0.122  (0.117-0.127)     97    0.192  (0.187-0.198)    191    0.707  (0.698-0.717)   4080    0.799  (0.790-0.808)   5548


aExpressed as proportion of ninety-five woodpecker-vacant cells (d(1)) and proportion of five woodpecker-occupied cells (d(2)) burned. Each decision
was simulated one hundred times under one hundred random woodpecker occupancy distributions.

bModel expressed as a character triplet W_ij T_k, where i indicates woodpecker colonization and persistence probabilities are (i = D) or are not (i = .)
sensitive to burning, j indicates woodpecker colonization probability is (j = S) or is not (j = .) sensitive to distance from a nearest-neighbor source
cell, and k indicates thrush occurrence probability is (k = D) or is not (k = .) sensitive to burning.


TABLE 16.2.


Mean and approximate 99% confidence interval for objective value (J) and frequency of optimality (n_opt) for four decisions under
eight alternative system models and two types of spatial arrangements of woodpeckers (dispersed versus clumped), given an
initial population of twenty woodpeckers.


                                                  Decision(a)
         d(1) = 0.2, d(2) = 0.2         d(1) = 0.2, d(2) = 0.8         d(1) = 0.8, d(2) = 0.2         d(1) = 0.8, d(2) = 0.8
Model(b) J      99% CI         n_opt    J      99% CI         n_opt    J      99% CI         n_opt    J      99% CI         n_opt

Initial woodpecker population size = 20, highly dispersed

W..T.    0.016  (0.015-0.017)   1441    0.016  (0.015-0.017)   1517    0.016  (0.015-0.017)   1499    0.015  (0.014-0.016)   1429
W..TD    0.018  (0.017-0.019)   1614    0.017  (0.016-0.018)   1554    0.016  (0.015-0.017)   1417    0.015  (0.014-0.015)   1365
WD.T.    0.002  (0.001-0.002)    124    0.015  (0.014-0.016)   1179    0.016  (0.015-0.017)   1151    0.060  (0.058-0.061)   4837
WD.TD    0.002  (0.002-0.002)    175    0.015  (0.014-0.015)   1285    0.015  (0.014-0.016)   1169    0.056  (0.054-0.057)   4725
W.ST.    0.208  (0.205-0.212)   2557    0.205  (0.202-0.209)   2409    0.207  (0.203-0.210)   2525    0.205  (0.202-0.209)   2430
W.STD    0.224  (0.220-0.227)   3146    0.248  (0.215-0.222)   2867    0.194  (0.191-0.197)   2122    0.185  (0.182-0.188)   1845
WDST.    0.015  (0.014-0.016)      1    0.061  (0.059-0.063)      8    0.363  (0.359-0.367)   2964    0.478  (0.475-0.482)   6993
WDSTD    0.017  (0.016-0.018)      3    0.066  (0.064-0.068)     19    0.344  (0.340-0.348)   3374    0.427  (0.424-0.430)   6576

Initial woodpecker population size = 20, highly clumped

W..T.    0.016  (0.015-0.017)   1478    0.017  (0.016-0.018)   1515    0.016  (0.015-0.017)   1498    0.017  (0.016-0.018)   1463
W..TD    0.018  (0.017-0.019)   1595    0.018  (0.017-0.019)   1631    0.016  (0.015-0.017)   1441    0.015  (0.014-0.016)   1392
WD.T.    0.002  (0.002-0.002)    137    0.015  (0.014-0.016)   1156    0.016  (0.015-0.017)   1086    0.061  (0.059-0.063)   4872
WD.TD    0.002  (0.002-0.002)    169    0.015  (0.014-0.016)   1299    0.015  (0.014-0.016)   1173    0.055  (0.053-0.056)   4702
W.ST.    0.149  (0.146-0.152)   2473    0.151  (0.148-0.154)   2505    0.150  (0.147-0.153)   2502    0.148  (0.145-[illegible])   2426
W.STD    0.160  (0.157-0.164)   2970    0.157  (0.154-0.160)   2765    0.144  (0.138-0.144)   2261    0.133  (0.130-0.136)   1964
WDST.    0.009  (0.008-0.010)      3    0.045  (0.043-0.046)   [illegible]    0.270  (0.266-0.274)   2844    0.383  (0.379-0.386)   7071
WDSTD    0.010  (0.009-0.010)      6    0.046  (0.045-0.048)     56    0.251  (0.248-0.255)   3070    0.342  (0.339-0.345)   6836


aExpressed as proportion of eighty woodpecker-vacant cells (d(1)) and proportion of twenty woodpecker-occupied cells (d(2)) burned. Each decision
was simulated one hundred times under one hundred random woodpecker occupancy distributions.

bModel expressed as a character triplet W_ij T_k, where i indicates woodpecker colonization and persistence probabilities are (i = D) or are not (i = .)
sensitive to burning, j indicates woodpecker colonization probability is (j = S) or is not (j = .) sensitive to distance from a nearest-neighbor source
cell, and k indicates thrush occurrence probability is (k = D) or is not (k = .) sensitive to burning.

Figure 16.2. Mean objective values for four landscape burning decisions under alternative models of red-cockaded woodpecker (Picoides
borealis) and wood thrush (Hylocichla mustelina) population dynamics. Decisions are expressed in the form (d^(1), d^(2)),
where d^(1) represents proportion of woodpecker-vacant habitat burned, and d^(2) represents proportion of woodpecker-occupied habitat
burned. Shading of decision bars increases from light to dark as extent of landscape burning increases. In each plot, decision
results are provided for four models of species response to burning: no response by either woodpeckers or thrushes (model W.iT.),
response by thrushes only (model W.iTD), response by woodpeckers only (model WDiT.), and response by both woodpeckers and
thrushes (model WDiTD), where subscript i indicates whether woodpecker colonization probability is (i = S, plots b, c, e, f) or is not
(i = ., plots a, d) sensitive to distance to nearest source neighbor. Decision outcomes vary according to two initial states of the system:
initial population size (plots a, b, c for n = 5 and plots d, e, f for n = 20), and initial distribution of woodpeckers (plots b and
e for dispersed distributions and plots c and f for clumped distributions).


joint woodpecker-thrush response model to become 
quite flat, almost resembling that for the null re- 
sponse model. 

We found that making the optimal decision was not
dependent on correctly identifying the appropriate
distance-sensitivity mechanism for woodpecker colonization
(contrast Fig. 16.2a with Fig. 16.2b-c and Fig.
16.2d with Fig. 16.2e-f), the initial abundance of
woodpeckers (contrast Fig. 16.2a-c with Fig.
16.2d-f), or the initial distribution of woodpeckers
(contrast Fig. 16.2b with Fig. 16.2c and Fig. 16.2e
with Fig. 16.2f). With regard to this latter result, however,
we point out that our simulated decisions were
carried out by selecting habitat cells completely at ran- 
dom with regard to woodpecker location. Had our de- 
cision set also included the selection of cells under 
some alternative sampling scheme (e.g., probability of 
selection of woodpecker-vacant cells inversely propor- 
tional to distance from woodpecker-occupied cells), 
we would then expect to find that the optimal decision 
does depend on correct identification of the initial dis- 
tribution of woodpeckers.


Discussion


This hypothetical example demonstrates the importance
of correct model identification in decision making,
or at least the importance of considering a set of
reasonable model alternatives. Furthermore, the rela- 
tive performance of decisions across the model set will 
vary according to form and parameterization of the 
objective function. 

In our example, we were omniscient observers of 
the system and could easily understand the implica- 
tions of each decision under each version of nature. In 
real systems, however, we are uncertain about the true 
version of nature, and our observations are incom- 
plete and imprecise, yet we still are faced with making 
an optimal decision for a management objective. Our 
real need, therefore, is twofold: to find the decision 
that maximizes some physical attribute of the system, 
and to apply the results of the decision action toward 
the reduction of uncertainty and toward better deci- 
sion making in the future. 

Suppose that we are managing a system that is de- 
scribed by one of the eight models above, but that we 
are completely uncertain about which model is cor- 
rect. We will also assume that the initial population of 
woodpeckers is five and that woodpeckers occur in a 
dispersed pattern. Then we may apply equation 16.1
to find the optimal decision under uncertainty, using
values of J (Table 16.1) for the ∫ u(a | x, θ_i) f(x | θ_i) dx and
p(θ_i) = 0.125. The maximum value of E(a) is 0.663,
which occurs for decision {d^(1), d^(2)} = {0.8, 0.8}.

Following the decision action, the system may be 
observed for a number of years until the time of the 
next management action. We assume that data are col- 
lected according to some design that yields observa- 
tions at temporal, spatial, and demographic resolu- 
tions that are consistent with model predictions. 
Suppose that data from the field, collected ten years 
following the decision action, provided a set of values 
L(x | θ), the statistical measures of agreement between
the data and each model θ (e.g., a sum of squared differences
between observations and model predictions
scaled by a variance measure). Furthermore, suppose 
that these values redistributed (through equation 16.2) 
model weight from the equal allocation of 0.125 for 
each model to the allocation of 0.79 to the model
W.STD (woodpecker management-insensitivity, woodpecker
distance-sensitivity, thrush management-sensitivity)
and 0.03 to each of the other seven models. Now,
if we are again required to choose a management action
for the next ten years, and again starting from an
initial condition of five dispersed woodpeckers, then
reapplication of equation 16.1 under these new
weights results in a maximum value of 0.606 for E(a),
which occurs for decision {d^(1), d^(2)} = {0.2, 0.8}. Thus,
at both decision periods we not only made optimal de- 
cisions under system uncertainty, we also exploited 
our decision action and our monitoring data to reduce 
uncertainty between decision episodes. Note also in 
this approach that statistical measures of model-to- 
data agreement are not used to make dichotomous, 
absolute assignments of model validity or invalidity 
based on arbitrary criteria. Rather, they are used in a 
way that allocates more or less credibility to a model 
over time without ever completely dismissing a con- 
tender from the model set. 
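
The two decision episodes described above can be reproduced from the mean objective values in Table 16.1 (initial population of five, dispersed). The sketch below is illustrative: model labels follow the table footnote (D = decision-sensitive, S = distance-sensitive), the weight update is written as a plain Bayes' rule on the assumption that equation 16.2 has that form, and the function names are hypothetical.

```python
MODELS = ["W..T.", "W..TD", "WD.T.", "WD.TD", "W.ST.", "W.STD", "WDST.", "WDSTD"]
DECISIONS = [(0.2, 0.2), (0.2, 0.8), (0.8, 0.2), (0.8, 0.8)]

# Mean objective value J for each model (rows) and decision (columns),
# taken from Table 16.1, n = 5, highly dispersed.
J = {
    "W..T.": [0.384, 0.385, 0.386, 0.388],
    "W..TD": [0.412, 0.414, 0.347, 0.345],
    "WD.T.": [0.175, 0.254, 0.521, 0.631],
    "WD.TD": [0.187, 0.267, 0.471, 0.559],
    "W.ST.": [0.632, 0.626, 0.631, 0.633],
    "W.STD": [0.674, 0.674, 0.576, 0.557],
    "WDST.": [0.161, 0.237, 1.045, 1.158],
    "WDSTD": [0.174, 0.256, 0.939, 1.031],
}


def expected_values(weights):
    """Model-weighted expected objective value of each decision."""
    return [sum(weights[m] * J[m][k] for m in MODELS) for k in range(len(DECISIONS))]


def best_decision(weights):
    """Decision with the largest expected value, and that value."""
    ev = expected_values(weights)
    k = max(range(len(ev)), key=ev.__getitem__)
    return DECISIONS[k], ev[k]


def update_weights(prior, likelihood):
    """Bayes' rule: posterior weight proportional to prior times likelihood."""
    unnormalized = {m: prior[m] * likelihood[m] for m in prior}
    total = sum(unnormalized.values())
    return {m: value / total for m, value in unnormalized.items()}


equal = {m: 0.125 for m in MODELS}
print(best_decision(equal))          # first decision episode

# Hypothetical likelihoods, chosen only so that the posterior matches the
# weights quoted in the text (0.79 for W.STD, 0.03 for each other model).
likelihood = {m: 0.03 for m in MODELS}
likelihood["W.STD"] = 0.79
updated = update_weights(equal, likelihood)
print(best_decision(updated))        # second decision episode
```

Under equal weights the best decision is (0.8, 0.8) with an expected value of about 0.663; under the updated weights it shifts to (0.2, 0.8) with an expected value of about 0.606.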


Summary 


We make several observations regarding the use of 
habitat models in a conservation decision context. 
First, the assessment of model accuracy (model vali- 
dation) must be based on observable phenomena so 
that model predictions can be directly compared to 
observations. “Suitability,” meaning the potential of 
a habitat to provide a portion of the needs of a pop- 
ulation, cannot itself be objectively measured, and 
models dependent on “suitability” as the output 
cannot be validated (but see Hill and Binford [Chap- 
ter 7] for another perspective). Second, even when 
presence, absence, or numerical abundance can be 
observed and appear to conform to model predic- 
tions, this agreement may constitute weak valida- 
tion, that is, not exclude competing explanatory or 
even null habitat models. Weak validation occurs 
for several reasons, including (1) possible existence 
of source-sink and other demographic phenomena 
tending to obscure functional relationships between 
habitat and populations; (2) weak evidence based on 
qualitative (e.g., present or absent) versus quantita- 
tive comparisons; (3) lack of statistical power; and 
(4) injudicious choice of spatial scale. Third, arbitrary
conventions of accuracy (map or attribute) or
precision are irrelevant to decision making and tend
to distract from the proper consideration of uncertainty;
decisions will always be
made under uncertainty. The key is to provide tenable
alternative models that make different predictions
about the relationship between management
actions and the objective. Finally, optimal decisions
can be made based on current information about the 
tenability of alternative models (expressed as model 
weights). Adaptation occurs when uncertainty is re- 
duced (changing model weights) by information 
feedback obtained in the course of management and 
monitoring. 


CHAPTER 17

Contrasting Determinants of Abundance in
Ancestral and Colonized Ranges of an
Invasive Brood Parasite


D. Caldwell Hahn and Raymond J. O’Connor 


A parasitic species functions on a different
ecological scale than any of its individual
hosts. Characterizing the niche of a generalist parasite 
requires sifting through the complex set of environ- 
mental variables underlying the distributions of its 
multiple hosts, then using an analytical technique that 
can distinguish between the relative influence of the 
environmental factors and the presence of the hosts 
themselves. The brown-headed cowbird (Molothrus 
ater) (henceforth “cowbird”) is an obligate parasite 
that never builds its own nest, and it is an extreme 
host-generalist that parasitizes over two hundred 
species of North American passerines (Ortega 1998). 
The cowbird switches among multiple host species in 
different geographic areas of its range, and it para- 
sitizes hosts with broad geographic ranges as well as 
hosts with ranges limited regionally or by habitat. The 
cowbird has a broad geographical range that covers 
most of the continental United States (see Fig. 17.1 in 
color section), an extent that few North American 
songbirds can match (Price et al. 1995). However, the 
range also has two distinct areas: an ancestral range 
and a more recently colonized area. The ancestral 
range lies in the plains and prairies of the central 
Great Plains, where cowbirds associated with migra- 
tory buffalo. The invaded range is distributed both 
east and west of the central United States and stretches 
to the Atlantic and Pacific coasts (Rothstein 1994). 


The cowbird’s range expansion occurred in associa- 
tion with European colonization of North America, so 
its occupation of the eastern United States is approxi- 
mately 350 years old and its occupation of the western 
United States may be as recent as 150 years. The cow- 
bird coexists successfully with domestic livestock and 
agriculture and also exploits suburban lawns and bird 
feeders (Ortega 1998). 

We hypothesized first that the distribution of the 
parasitic cowbird would be less influenced by climate 
and weather factors than are the distributions of other 
songbirds, and, second, that different niche attributes 
would characterize the cowbird’s ancestral and colo- 
nized ranges. 


Methods 


Our analysis was based on mapping abundance and 
environmental variables to a spatial grid, followed by 
statistical analysis to relate cowbird abundance at 
each location to the environmental conditions and 
host densities there. Our spatial grid was the hexago- 
nal grid developed for the U.S. Environmental Protec- 
tion Agency (EPA) for use in the Environmental Mon- 
itoring and Assessment Program (EMAP) (White et al. 
1992). Each hexagon was approximately 640 square 
kilometers in area and approximately 12,600 hexa- 
gons cover the conterminous United States. A hexago- 
nal grid, unlike a square grid, has a constant center-to-center
distance between adjacent grid cells (here 27
kilometers).
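
The stated spacing follows from the hexagon area. As a quick check, assuming regular hexagons, a cell of area A has center-to-center spacing d with A = (√3/2) d²; the helper name below is hypothetical.

```python
import math


def hex_center_spacing(area_km2):
    """Center-to-center distance (km) between adjacent regular hexagons."""
    return math.sqrt(2.0 * area_km2 / math.sqrt(3.0))


spacing = hex_center_spacing(640.0)
print(round(spacing, 1))  # about 27 km for 640-square-kilometer hexagons
```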

For predictor variables we used the land cover class 
and environmental data compiled by O'Connor et al. 
(1996). Loveland et al. (1991) used Advanced Very
High Resolution Radiometer (AVHRR) meteorological
satellite images to derive a prototype land cover
classification for the conterminous United States at 
1.1-square-kilometer sensor resolution. O'Connor et 
al. (1996) added an urban class from the Digital Chart 
of the World (Danko 1992), summarized the represen- 
tation of each of the 160 cover classes for each of the 
12,600 EMAP hexagons, and computed landscape 
metrics such as patch size distributions, shape com- 
plexity, contagion and dominance, fractal dimension, 
types and frequencies of habitat edges, road abun- 
dance, and total length of riparian systems for each 
hexagon (Hunsaker et al. 1994; O'Connor et al. 
1996). Several climate variables—annual precipita- 
tion, mean January and mean July temperatures, and 
annual temperature variation (seasonality)—in the 
form of long-term climate averages from the Histori- 
cal Climatology Network (Quinlan et al. 1987; HCN 
1996) were also incorporated. The data were modeled 
with 1-kilometer resolution (except that precipitation 
was modeled to 10 kilometers and then resampled to 
1 kilometer) and were then summarized within each 
hexagon as average, minimum, and maximum values. 
Other variables included in the environmental data set 
were ownership (federal or nonfederal), road density 
(separately for major and minor roads), and stream 
density. All were expressed as within-hexagon aver- 
ages and corresponding extrema (O'Connor et al. 
1996). 

The bird data analyzed came from the national 
Breeding Bird Survey (BBS) (Sauer et al. 1997). The 
BBS comprises bird surveys of a stratified random 
sample of 25-mile (40-kilometer) lengths of secondary 
roadside; for each of fifty stops spaced at half-mile 
(0.8-kilometer) intervals, an observer records all birds 
registered in a three-minute count. Criteria concerning 
timing, weather conditions, and so on must be met for 
the route to be judged of acceptable quality for inclu- 
sion in the survey. In the present analysis, some 1,223 
routes within the conterminous United States with at 
least seven high-quality surveys between 1981 and
1990 were used (following O'Connor et al. 1996). For
each species of interest (below), the proportion of sur- 
veys in which the species was recorded at the site was 
computed, to provide a measure of incidence; this 
measure is typically correlated with absolute density 
and is a relatively robust measure of abundance. To 
obtain an overall index of cowbird host abundance 
the incidence values for the different species in two 
lists of host species were summed. One list was of the 
fifty most frequently parasitized host species and the 
other was of geographically widespread hosts (see Ap- 
pendix). In addition, the numbers of these hosts pres- 
ent at each location were determined and used as a 
predictor variable in analyses below. 
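
The incidence measure and the host abundance index can be illustrated with a short sketch. The detection records and species labels here are hypothetical placeholders, not BBS data, and the function names are our own.

```python
def incidence(detections):
    """Proportion of surveys in which a species was recorded on a route."""
    return sum(1 for d in detections if d) / len(detections)


def host_abundance_index(route_records, host_species):
    """Sum of incidence values over the host species recorded for a route."""
    return sum(incidence(route_records[s]) for s in host_species if s in route_records)


def hosts_present(route_records, host_species):
    """Number of listed host species detected at least once on the route."""
    return sum(1 for s in host_species if any(route_records.get(s, [])))


# Hypothetical route: ten annual surveys, 1 = species recorded that year.
route = {
    "host_a": [1, 1, 1, 1, 0, 1, 1, 1, 0, 1],   # incidence 0.8
    "host_b": [0, 1, 0, 1, 1, 0, 1, 0, 1, 0],   # incidence 0.5
    "host_c": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],   # never recorded
}
hosts = ["host_a", "host_b", "host_c"]
index = host_abundance_index(route, hosts)
count = hosts_present(route, hosts)
```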


Statistical Analysis and Modeling 


We used classification and regression tree (CART) 
modeling (Sonquist et al. 1973; Breiman et al. 1984) 
to identify the nonlinear relationships between our re- 
sponse variables and the land-use, pattern metric, and 
climate covariates. Traditional linear regression and 
correlation techniques assume that independent vari- 
ables entering the regression model have common ef- 
fects across the entire sample, an assumption unlikely 
to be the case here. Moreover, these techniques require 
explicit specification of terms for interactions. We 
used the S-Plus (MathSoft 1995, Seattle, Wash.) implementation
of CART (Clark and Pregibon 1992;
Venables and Ripley 1994) to partition our response 
variables recursively with respect to a set of selected 
covariates. At each node, the independent variable 
that best discriminated the response variable was used 
in the tree as the splitting variable for that node. Dis- 
crimination was maximized by trying all possible 
splitting thresholds for all possible prediction vari- 
ables and choosing the variable and threshold to max- 
imize the differences in the response variable (maxi- 
mum between-group diversity) before splitting the 
dataset into two subsets. The process was then re- 
peated independently and recursively on each increasingly
homogeneous subgroup until a stopping criterion
was satisfied. This tree was then pruned back using 
tenfold cross-validation (Clark and Pregibon 1992). 
This strategy reduced the propensity of CART models 
to over-fit the data. Since cross-validation is currently
the subject of debate among statisticians (e.g., Miller
1994b) we used the criteria developed by Sifneos and 
her colleagues (J. Sifneos, D. White, and N. S. 
Urquhart personal communication) on the basis of ex- 
tensive experiments in optimizing cross-validation re- 
covery of known data structures. We also perturbed 
the response variable by 5 percent, and re-ran the 
model to check for overall consistency in tree struc- 
ture. We controlled for collinearity problems by ran- 
domly perturbing each independent variable in the 
pruned model by up to 5 percent and re-running the 
analysis to check for inclusion or omission of the vari- 
able in the tree. Variables stable in the face of such 
perturbation could not be markedly collinear with any 
other variable in the data set. The models presented 
here passed all these checks. 
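
The split search at a single node can be sketched as below. This is not the S-Plus implementation: it assumes a regression-tree criterion in which the best split minimizes the summed within-group squared error (equivalently, maximizes the between-group difference), and the function names and data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, y):
    """Try every threshold on every predictor; return (score, variable, threshold)."""
    best = None
    for j in range(len(rows[0])):
        levels = sorted({r[j] for r in rows})
        # Candidate thresholds lie midway between adjacent observed values.
        for lo, hi in zip(levels, levels[1:]):
            threshold = (lo + hi) / 2
            left = [yi for r, yi in zip(rows, y) if r[j] <= threshold]
            right = [yi for r, yi in zip(rows, y) if r[j] > threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, threshold)
    return best


# Hypothetical data: two predictors per site, response = incidence.
rows = [(1.0, 7.0), (2.0, 3.0), (3.0, 9.0), (4.0, 1.0), (5.0, 8.0), (6.0, 2.0)]
y = [0.1, 0.2, 0.1, 0.9, 0.8, 0.9]
score, variable, threshold = best_split(rows, y)
```

A full tree would apply `best_split` recursively to the two resulting subsets until a stopping rule is met, and then prune by cross-validation as described above.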


Geographical Delineation of Regions 


We operationally defined the ancestral zone by over- 
laying the areas of highest cowbird abundance in the 
present-day distribution (Fig. 17.1) on a map of 
Omernik ecoregions (Omernik 1987) and defining the 
range as those ecoregions with cowbirds. The delin- 
eated area ranged from North Dakota south to Texas 
and east to Indiana, Kentucky, and western Tennessee 
and Mississippi. 


Results 


Our findings provide insight into the relative impor- 
tance of physical and biotic variables in predicting the 
occurrence of cowbirds. 


Environmental Factors Determining 
Cowbird Distribution 


Breeding Distribution. Our first analysis examined 
the relative importance of different predictors of 
brown-headed cowbird incidence at a national scale. 
The model explained a significant percentage (50.9 
percent) of the variance in cowbird abundance and 
identified five major predictors: crops, Conservation 
Reserve Program (CRP) lands, geography (region), 
weather, and climate (Table 17.1). The single best 
predictor of cowbird incidence was crops occurrence, 
which explained 15.7 percent of the variance in cowbird
incidence. Specifically, soybeans, maize, sunflowers,
and sorghum (in that order) each accounted
for 3.8 to 2.1 percent of variability in cowbird presence;
other crops were combined in a fifth category.
The second-best predictor (11.3 percent) was location
of CRP lands, which are formerly farmed areas
that have been allowed to go fallow for ten years.
The other three major predictors identified were
geography/region (9.6 percent), weather (9.0 percent),
and climate (5.3 percent). The geography/region
variable comprised longitude (7.6 percent) and latitude
(2.0 percent) components.


TABLE 17.1.


Major biophysical predictive factors influencing brown-headed
cowbird (Molothrus ater) distribution.


                                Summer             Winter
Factors                         distribution (%)   distribution (%)
Climate                          5.3               21.7
Weather                          9.0                4.8
Geography/region                 9.6                5.9
Conservation Reserve Program    11.3                -
Crops                           15.7                4.0
Total R2                        50.9               36.4

Winter Distribution. We compared the major pre- 
dictors of cowbird incidence on wintering grounds to 
those on breeding grounds (Table 17.1). Four of the 
five variables proved to be major predictors for both 
distributions, although in winter the relative impor- 
tance shifted away from crops and toward climate. 
The CRP dropped out in this winter analysis. 


Influence of Host Abundance on 
Cowbird Distribution 


When we repeated our analysis of the cowbird's sum- 
mer distribution with the host variables included, the 
list of major predictors shifted dramatically from 
crops to host abundance (Table 17.2, column A). 
Crops and CRP lands no longer emerged as major pre- 
dictors, although these factors had accounted for 27 
percent of the variance in the previous analysis of 
cowbird's distribution. Host Abundance Index rose to 
the top of the list of major predictors (18.9 percent) 
and accounted for more of the variance in cowbird in- 
cidence than had any single variable in the previous


TABLE 17.2.


Major biophysical and avian predictive factors determining distribution of the brown-headed cowbird (Molothrus ater). 


Predictive factors    A: Cowbird range with hosts (national) (%)    B: Ancestral cowbird range (regional) (%)    C: Colonized cowbird range (regional) (%)    D: Grassland passerines' range(a) (%)


Crops 7.46 

Climate and weather 16.2 

Geography = 

Host abundance 18.9 

Overall species richness 4.9 

Total R2 47.1 2m8 


12.8 


— 33 
6.1 1819 
4.6 TETA 
26.6 NA 
8.3 NA 
45.6 59.6 


aO'Connor et al. (1999:51): obligate grassland passerines of North America: horned lark (Eremophila alpestris), vesper sparrow (Pooecetes
gramineus), lark bunting (Calamospiza melanocorys), savannah sparrow (Passerculus sandwichensis), grasshopper sparrow (Ammodramus
savannarum), Baird's sparrow (Ammodramus bairdii), Henslow's sparrow (Ammodramus henslowii), McCown's longspur (Calcarius mccownii),
dickcissel (Spiza americana), bobolink (Dolichonyx oryzivorus), eastern meadowlark (Sturnella magna), western meadowlark (Sturnella neglecta).


bAll involved patch size of cropland, pure or mixed.


analysis. A related biological factor, overall species 
richness, also emerged (4.9 percent). 

We present the details of this analysis as they ap- 
pear in the CART tree in order to explain the sequence 
and interrelations of the principal factors (Fig. 17.2). 
The overriding prominence of host abundance in pre- 
dicting cowbird presence was reflected in its index 
being the splitting factor at the first tier of the CART 
analysis. As reflected in the right branch of the tree, 
when host abundance was high (i.e., a Host Abun- 
dance Index value for the route greater than 5.4), 
cowbirds were recorded in 93 percent of the surveys 
there. Host abundance was more important than over- 
all avian abundance or diversity, though total species 
richness did appear as a splitting factor in the second 
tier of the left branch (segregating the locations in end 
node A from those in nodes B through E). 

The predictors second and third in importance
in this analysis were climate (16.2 percent) and
patch variables (7.1 percent). Among the climate vari- 
ables, seasonality—differences between mean January 
and mean July temperature—was the major contribu- 
tor. Areas of high seasonality are those that experience 
a strong seasonal flush of productivity and that attract 
a rich assemblage of breeding species that take advan- 
tage of the abundant food resources (Ashmole 1963). 
Figure 17.2 shows that on routes associated with the 
higher seasonality index (in nodes C, D, and E on the 
left branch of the tree) 77 percent of the surveys 


recorded cowbirds versus a 35 percent incidence for 
less-seasonal areas (node B). On the right branch, in 
colder areas associated with lower maximum January 
temperatures (nodes F, G, H, and I) 94 percent of the 
surveys recorded cowbirds against only 5 percent of 
the surveys in the warmer areas of node J. Thus, the 
northern United States has many more cowbirds than 
the southern United States but in association with dif- 
ferent predictors in different regions. 

The specific land cover classes for which patch 
variables were predictors were dominated respec- 
tively by crop/grassland mixtures (node C versus 
nodes D and E), small forests of maple-birch-beech 
in corn-soybean areas (nodes F and G), and row 
crops (nodes H and I). In the case of the row crops 
variable, it was the size of the patches of row 
crops relative to the national average for patches of 
this type that had predictive power: hexagons in 
which blocks of row crops were relatively small— 
less than about 5 percent of the national average— 
were more likely to hold cowbirds than were hexa- 
gons with larger expanses of row crops. Patch 
variables can point to habitat fragmentation effects, 
and their emergence here indicates their strong pre- 
dictive value in determining cowbird abundance. All 
three patch-size variables identified are involved in 
area sensitivity, with cowbirds more abundant in 
hexagons with small rather than large patch sizes 
(note the higher value in the left-hand nodes in each 


[Figure 17.2 tree diagram. Legible split labels: Host Abundance Index (threshold 5.4); number of species (42.5); maximum January temperature (14.5 deg F); maximum seasonality (16.5 deg F); average July temperature (17.6 deg F); crops/grass patch size (3.7 sq km); wood/crops patch size (2.8 sq km); row-crop relative patch size (0.052).]


Figure 17.2. Classification and regression tree (CART) model for brown-headed cowbird abundance across the conterminous United
States. Numbers inside the oval (intermediate nodes) or rectangles (end nodes) are the percentage of Breeding Bird Survey (BBS) 
routes, rounded to whole numbers, on which the brown-headed cowbird (Molothrus ater) was detected. The splitting variable and its 
threshold value are shown at each node. The end nodes are labeled from A through J and represent a set of hexagons with the char- 
acteristics of the unique set of splitting variables preceding it. 
TABLE 17.3.


Predictive model for the probability of occurrence of brown-headed cowbirds (Molothrus ater).ᵃ


               Host abundance             Total species         Seasonality        Probability that cowbirds
Alternative    index                      richness index        index              are present (% routes)

1.             If > 5.4 spp/route,        [any value]           [any value]        then 93
2.             If ≤ 5.4 spp/route, and
   a.                                     > 42 spp, and         > 16.5 deg. F,     then 76.7
   b.                                     > 42 spp, but         < 16.5 deg. F,     then 36.9
3.             If ≤ 5.4 spp/route,        but < 42 spp,         [any value]        then 26.2


ᵃOther predictors (e.g., climate, land use, and habitat patchiness) modify these rules locally.


of the sibling nodes involving patch variables: C ver- 
sus the DE parent, F versus G, and H versus I). 


Predictive Model for the Probability of 
Occurrence of Cowbirds 


The information depicted in Figure 17.2 can be recast 
as a series of explicit predictive rules (Table 17.3). 
This hierarchy of rules specifies (1) if Host Abundance 
Index in a hexagon or on a route is greater than 5.4 
species/route, then the probability of cowbirds being 
present is 93.5 percent; (2) if the Host Abundance 
Index is less than or equal to 5.4 species/route, then 
two additional factors must be assessed to make a pre- 
diction, yielding (a) if the number of species present 
(total of hosts and nonhosts) is greater than forty-two 
species and the seasonality index is greater than 16.5 
degrees Fahrenheit, then the probability of cowbirds 
being present is still high (76.7 percent); otherwise (b) 
in hexagons where Host Abundance Index is less than 
or equal to 5.4 species and the total number of avian 
species present is greater than 42, but the seasonality 
index is less than 16.5 degrees Fahrenheit, the proba- 
bility of cowbirds being present is only 36.9 percent; 
and finally (3) if the Host Abundance Index is less 
than or equal to 5.4 species and the total number of 
avian species present is less than 42 species, then the 
probability of cowbirds being present—regardless of 
seasonality index—drops to 26.2 percent. 
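
The rule hierarchy above can be written out as a short function. The following Python sketch is an illustration only, not part of the original analysis: the function and argument names are hypothetical, the probabilities are those listed in Table 17.3 (which gives 93 percent for the first rule), and values falling exactly on the species richness or seasonality thresholds, which the rules leave unspecified, are assigned here to the lower branch.

```python
def cowbird_presence_percent(host_abundance_index, species_richness, seasonality_deg_f):
    """Percentage of routes expected to record cowbirds, following Table 17.3.

    All names are hypothetical. Values exactly on a threshold for species
    richness or seasonality are not covered by the published rules; this
    sketch assigns them to the lower branch (an assumption).
    """
    # Rule 1: abundant hosts override every other predictor.
    if host_abundance_index > 5.4:
        return 93.0
    # Rule 2: hosts are scarce, but many bird species are present overall.
    if species_richness > 42:
        # 2a: strongly seasonal climate; 2b: weakly seasonal climate.
        return 76.7 if seasonality_deg_f > 16.5 else 36.9
    # Rule 3: hosts are scarce and few species are present,
    # regardless of the seasonality index.
    return 26.2


# Hypothetical routes: (host abundance index, species richness, seasonality)
for route in [(6.1, 35, 10.0), (4.0, 50, 20.0), (4.0, 50, 12.0), (4.0, 30, 20.0)]:
    print(route, cowbird_presence_percent(*route))
```

Because the first test is on host abundance, the other two indices matter only for routes where hosts are scarce, which mirrors the first-tier split of the CART tree.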


Role of Host Species' Habitat Preferences 


Shared habitat correlates between host and parasite
might explain why host abundance figured so prominently
for cowbirds in Table 17.2. Our perturbation
tests (see “Methods”) precluded simple confounding 
of variables but more complex commonality in habitat 
requirements would not necessarily be excluded. We 
therefore analyzed host abundance data with respect 
to land cover and climate variables and compared the 
spectrum of predictors against that for cowbirds. Pre- 
cipitation was the strongest predictive factor (44.2 
percent): host abundance was low where precipitation 
was low; intermediate in wetter areas where total 
species richness was low and also in areas where 
species richness was high but seasonality was weak; 
and highest where seasonality was high. The second 
and third major predictive variables were species rich- 
ness (14.2 percent) followed by seasonality (8.8 per- 
cent). Since the variables that predict host abundance 
are quite different from those identified for the brown- 
headed cowbird (Table 17.1), it is unlikely that the 
cowbird-host abundance association above was due to 
shared habitat requirements. 


Regional Analysis: Ancestral Versus 
Colonized Range 


Host abundance was a major predictor in both ances- 
tral and colonized regions, although its influence rela- 
tive to biophysical factors was markedly different in 
the two regions (Table 17.2, columns B and C). In the 
ancestral range, biophysical factors (topography and 
elevation) carried over four times as much influence 
(24 percent) on cowbird incidence as did biological 
ones (5.5 percent). In this model, the principal host 
abundance factor was presence of host species with
geographically broad distributions, and the level was
similar to that found for overall species richness (4.9 
percent) on the previous national-scale analysis (Table 
17.2). The major host abundance indicator that had 
appeared on the national scale analysis, or in other 
words the extent of the presence of the fifty most fre- 
quently parasitized hosts (18.9 percent), did not 
emerge as a major predictor within the ancestral range. 

In contrast, within the colonized range the relative 
importance of biological versus biophysical factors 
was reversed, and host abundance indices carried five 
times as much influence on cowbird incidence as did 
biophysical factors (Table 17.2, column C). Specifi- 
cally, host abundance indices accounted for 34.9 per- 
cent of the variability in cowbird incidence, the high- 
est value for a predictor in any of our models of 
cowbird incidence. Within the general category of host
abundance, the principal specific factor was an
index of the fifty most frequently parasitized cowbird
hosts (26.6 percent).

Among the biophysical factors that emerged as pre- 
dictors in both the ancestral and the colonized range, 
climate appeared in the cowbird's ancestral range at a 
level (12.8 percent) similar to that in the national 
analysis (16.2 percent) (Table 17.2) and with the same 
components, specifically high average July tempera- 
ture (8.6 percent) and the degree of seasonality (4.2 
percent). Topography, expressed as lack of elevation,
accounted for 9.5 percent of the variation in
the model but the crops and patch variable predictors 
of the continental analyses were not important. The 
total variance accounted for in this CART model for 
the ancestral cowbird range was 27.8 percent, a level 
approximately half that in previous models at the na- 
tional scale. In the region of colonized range, the phys- 
ical factors that emerged were topography (4.6 per- 
cent) and climate (6.1 percent) (Table 17.2). The total 
variability accounted for by the CART model for the 
colonized range was 45.6 percent, again similar to the 
levels explained by CART in both the national-scale 
analyses (Table 17.2). 


Discussion 


The brown-headed cowbird is recognized as a text- 
book example of a species that must be studied at dif- 


ferent scales to answer different questions (Robinson 
1999; Morrison and Hahn, in press). Before 1990, 
most studies of the species were local-scale field stud- 
ies conducted at sites in the cowbird's ancestral range, 
and they typically looked at parasite behavioral strategies
such as nest-searching or mating system and at
the rates and effects of parasitism on different host 
species (see Ortega 1998). Attempts to extract general 
principles from local studies often yielded contradic- 
tory patterns that reflected the large number of host 
species and habitats exploited by the brood parasite. 

Since 1990, much work has been done over larger
areas (extent and resolution) and at sites in the colonized
range of the brown-headed cowbird (see Morrison
et al. 1999; Smith et al. 2000). The design and location
of these studies reflect recent interest in
investigating which ecological factors are driving the 
cowbird's expansion into new habitats and new hosts, 
particularly in the forest interior. Facilitated by ad- 
vances in radiotelemetry and GIS (geographic infor- 
mation system) techniques, many of these studies were 
conducted across landscapes, with a few at the re- 
gional scale (Robinson et al. 1995a; Morrison et al. 
1999; Smith et al. 2000). 

From these studies, generalizations emerged that 
apply within landscapes but questions remained as to 
how these patterns might change at larger scales and 
what other patterns might appear at larger areas. 
Robinson (1999) summarized the core conclusions 
from landscape and regional studies: cowbird abun- 
dance and parasitism rates are much lower in forested 
landscapes where foraging opportunities are limited; 
cowbird abundance decreases with distance from rich 
feeding areas; and cowbird abundance is correlated 
with host abundance in landscapes with unlimited for- 
aging habitat. The only national-scale analyses of cow- 
bird abundance are three studies of population trends 
based on BBS data, which concluded that cowbird 
numbers are stable nationally, with slight increases
or decreases in some regions that are not linked
to declines or increases in host populations (Maurer 
1993; Peterjohn et al. 2000; Wiedenfeld 2000). 

Our study provides the first national perspective on 
the principal factors that underlie the abundance of the 
parasitic cowbird. Our findings unambiguously identi- 
fied host abundance as the fundamental predictor of the 
cowbird's distribution at the national scale. Although
avian species distributions are typically constrained by 
spatially extensive variables such as climate, habitat, 
spatial patchiness, and microhabitat attributes, we had 
hypothesized that the distribution of a brood parasite 
depends as strongly on host distribution patterns as on 
biophysical factors. Our findings suggest that the distri- 
bution of hosts does indeed take precedence over habi- 
tat attributes in shaping the cowbird's distribution at a 
national scale, within an envelope of constraint set by 
biophysical factors. The importance of hosts can be 
missed, because an analysis of the predictive values of 
biophysical factors alone (Table 17.1) yields the result 
that crops and CRP lands are dominant predictors, a 
result that fits the profile of the brown-headed cowbird 
as a bird of the central prairies that associates with 
agriculture. 

Many studies have weighed the relative importance
of host availability versus food availability. Our results
suggest that while host availability is predominant for 
the national distribution of cowbirds, food availability 
(associated with the three patch variables that include 
foraging habitats) is a major factor in particular habi- 
tats. Three different patch types were detected in 
widely separate nodes in our CART model for cow- 
bird abundance reflecting the importance of this land- 
scape pattern. Robinson (1999) concluded that host 
abundance was the most influential environmental 
variable, with the caveat that this was true only when 
food is sufficient. Several studies have found food 
availability a more fundamental determinant of cow- 
bird incidence, particularly in habitats such as those 
where forested areas are extensive (Morrison et al. 
1999). The influence of food availability on the winter 
distribution of cowbirds is a separate question, also 
much debated, and our analysis showed a sharp repo- 
sitioning of the major predictors from summer to win- 
ter distribution, dropping crops to 4.0 percent and in- 
creasing climate nearly fourfold to 21.7 percent (Table 
17.1). These results suggest that either food availabil- 
ity is a significant constraint on cowbird populations 
(Robinson 1999) or energetic constraints limit the 
cowbird's ability to exploit cold areas (Root 1988c). 

Our CART analysis of summer distribution also 
distinguished the influential size of patch variables and 
found that cowbirds are more abundant in hexagons


with small rather than large patch size. This result 
confirms the observations of many local studies that 
larger forest stand size limits cowbird incidence (Mor- 
rison et al. 1999). Since the reverse pattern of area 
sensitivity characterizes Neotropical migrants (i.e.,
they are more abundant in hexagons with large patch 
sizes), this result indicates that separate niches still 
exist between cowbirds and forest-interior nesting 
birds. 

Our analysis identified climate and weather as the 
second most important predictors. Local- and land- 
scape-scale studies have rarely addressed climate and 
weather, which illustrates how an overlooked variable 
can emerge when an analysis shifts from a local to a 
continental scale. The areas of high seasonality in the 
north-central region of the United States experience a 
strong seasonal flush of productivity that attracts a 
rich assemblage of breeding species that take advan- 
tage of the abundant food resources (O'Connor et al. 
1996). These are the areas of greatest cowbird abun- 
dance. Seasonality is very highly correlated with the 
proportion of Neotropical migratory songbirds in an
area, a pattern first hypothesized by Ashmole (1963) 
and since supported by Wilson (1974), Herrera 
(1978), and Ricklefs (1980). Thus, cowbirds can ex- 
ploit both the flush of productivity and the abundance 
of breeding hosts. 

The importance of host abundance to cowbird dis- 
tribution is further put into perspective by the regional 
analysis. Distinguishing the cowbird's ancestral range 
from its colonized range revealed a strong geographi- 
cal bias in the influence exerted by host species. In the 
colonized range, host abundance indices carried five 
times as much influence as biophysical factors. This 
result may reflect the fact that there is greater variance 
in incidence of cowbirds in the colonized range and 
that cowbirds colonize new areas only where condi- 
tions are good. In the East, although cowbird hosts 
are more abundant, much of their breeding habitat is 
in large forests, where they are inaccessible. In the 
West, host species are concentrated in riparian areas, 
which draws cowbirds to those sites. We plan more 
detailed analyses of the colonized range in which we 
look at the eastern and western ranges separately, 
since both the habitat types and host abundance levels 
are distinctly different (O'Connor et al. 1996). 
Host abundance is not the dominant predictor of
cowbird incidence in its ancestral range, although it 
does emerge as a lesser predictor. Here topography and 
elevation carried over four times as much influence as 
biological factors (24 versus 5.5 percent). These results 
may reflect conditions that are relatively uniformly 
good for cowbirds and consequently the variance in 
cowbird incidence is lower. In future analyses, we plan 
to subdivide the area we designated as ancestral into 
the core area where cowbird abundance is greater than 
thirty birds per route (i.e., the Dakotas, eastern Ne- 
braska and Kansas, western Missouri, and Iowa) and 
the surrounding zones where cowbird abundance is 
eleven to thirty birds per route (see Fig. 17.1). 

The analyses in this study accurately distinguished 
the cowbird’s ecological niche as a parasite. The CART 
results identified a different set of predictors for cow- 
birds and for their most frequently parasitized hosts, il- 
lustrating that the cowbird distribution is not simply 
the result of shared habitat preferences. Moreover, the 
CART analysis distinguished a different set of predic- 
tors for the parasitic cowbird and the guild of obligate 
grassland passerines that are its ancestral hosts 
(O’Connor et al. 1999). An important reservation about
the findings here is that the CART models we used, de- 
spite their sophistication, return only estimates of corre- 
lation. Therefore, our conclusions are subject to the 
normal caveats of correlation analysis, in particular that 
correlation does not ensure causation. Our emphasis on 
host availability and patch size arrives at the same con- 
clusions as those of earlier investigators, and since ours 
are based on analyses with very different biases than 
those in site-specific studies, this lends strength to all the 
studies. We distinguished the strong influence of climate 
and weather, largely overlooked in landscape-scale
studies, and we described differences between the cow- 
bird’s ancestral and colonized range in the role of host 
abundance. The broad spatial extent of our analyses 
provides a robust overview of the correlates of the dis- 
tribution of the principal North American brood para- 
site that has not previously been available. 
Appendix


The fifty most frequently parasitized host species as 
identified by Friedmann (1963) by a review of pub- 
lished studies. No comparable reassessment of fre- 
quency of parasitism has been done, but these fifty 
hosts still appear representative (Rothstein, pers. 
comm.; DCH, pers. obs.). Friedmann designated a first
(primary) group of seventeen hosts (“1,” more than 
one hundred records of parasitism), a second group of 
seventeen hosts (“2,” more than fifty records of para- 
sitism), and a third group of sixteen hosts (“3,” 
twenty-five to fifty records of parasitism). Twelve host 
species that Friedmann designated common hosts of 
great geographic availability are indicated with a “g.” 
Three host species indicated by an asterisk (*) are ob- 
ligate grassland species included in Table 17.2. 


Acadian flycatcher (Empidonax virescens): 3 
American goldfinch (Carduelis tristis): 2, g 
American redstart (Setophaga ruticilla): 1 
Bell’s vireo (Vireo bellii): 1 

Black-and-white warbler (Mniotilta varia): 3 
Blue grosbeak (Guiraca caerulea): 3

Blue-gray gnatcatcher (Polioptila caerulea): 3 
Blue-winged warbler (Vermivora pinus): 3 
Brown thrasher (Toxostoma rufum): 3 
Chestnut-sided warbler (Dendroica pensylvanica): 2
Chipping sparrow (Spizella passerina): 1, g 
Clay-colored sparrow (Spizella pallida): 2 
Common yellowthroat (Geothlypis trichas): 1, g 
Dickcissel (Spiza americana): 2, * 

Eastern bluebird (Sialia sialis): 3 

Eastern phoebe (Sayornis phoebe): 1 

Eastern towhee (Pipilo erythrophthalmus): 1, g 
Eastern wood-pewee (Contopus virens): 2 

Field sparrow (Spizella pusilla): 1 

Gray catbird (Dumetella carolinensis): 3 

Hermit thrush (Catharus guttatus): 3 

Indigo bunting (Passerina cyanea): 1 

Kentucky warbler (Oporornis formosus): 1 
Kirtland’s warbler (Dendroica kirtlandii): 2 
Lark sparrow (Chondestes grammacus): 3, g 
Louisiana waterthrush (Seiurus motacilla): 2 
Yellow-rumped warbler (Dendroica coronata): 2 
Northern cardinal (Cardinalis cardinalis): 2 


Ovenbird (Seiurus aurocapillus): 1 

Painted bunting (Passerina ciris): 2 

Prairie warbler (Dendroica discolor): 3 
Prothonotary warbler (Protonotaria citrea): 2 
Red-eyed vireo (Vireo olivaceus): 1, g 

Red-winged blackbird (Agelaius phoeniceus): 1, g 
Rose-breasted grosbeak (Pheucticus ludovicianus): 3 
Savannah sparrow (Passerculus sandwichensis): 3, * 
Scarlet tanager (Piranga olivacea): 2 
Song sparrow (Melospiza melodia): 1, g 
Swamp sparrow (Melospiza georgiana): 2 
Willow flycatcher (Empidonax traillii): 1 
Veery (Catharus fuscescens): 2 

Vesper sparrow (Pooecetes gramineus): 2, *
Warbling vireo (Vireo gilvus): 2, g 
White-eyed vireo (Vireo griseus): 3 
White-throated sparrow (Zonotrichia albicollis): 3 
Wood thrush (Hylocichla mustelina): 2 
Worm-eating warbler (Helmitheros vermivorus): 3 
Yellow warbler (Dendroica petechia): 1, g 
Yellow-breasted chat (Icteria virens): 1, g 
Yellow-throated vireo (Vireo flavifrons): 1, g 
Biodiversity Conflict Analysis at
Multiple Spatial Scales 


Christopher B. Cogan 


As society places increasing demands on the environment,
methods for predicting and minimizing
environmental effects of population growth and land- 
use change are urgently needed. Much effort is now 
focused on developing spatial decision support sys- 
tems (SDSS) for regional planning and policy analysis 
(Jankowski et al. 1997). These efforts draw from both 
the social and natural sciences, and a key challenge is 
reconciling and integrating the disparate paradigms, 
semantics, measurements, and scales of analysis from 
these different traditions (Machlis 1992). In this chap- 
ter, I present a case study of Santa Cruz County, Cali- 
fornia, that looks at the integration of ecological and 
urban planning perspectives as a way to help local and 
county land-use planners evaluate the potential threats 
of land-use change to native biodiversity. To date, this 
critical aspect of the planning process is usually not at- 
tempted (Press et al. 1996; Crist et al. 2000). I ap- 
praised local ecological resources by considering re- 
gional distribution patterns, rarity, uniqueness, and 
restoration potential. Predicted patterns of future de- 
velopment from an urban growth model were com- 
pared to appraised biodiversity value to map relative 
risk of biodiversity loss and fragmentation. This ap- 
proach also served to identify areas where conflicts be- 
tween goals of urban expansion and biodiversity con- 
servation were likely to be greatest. 

Biodiversity is a documented metric for ecosystem 


health (Callicott and Mumford 1997) functioning at 
particular spatial scales (extent and grain) and organi- 
zational levels (Soulé 1991; Caughley 1994; Chapin et 
al. 1997; Soulé and Mills 1998). Despite this multi- 
scale attribute of biodiversity, there are few examples 
of analysis that bridge landscapes and habitats (e.g., 
500 hectares) to ecoregions or other large planning 
areas (e.g., 3 million hectares). Noteworthy publications
include Hansen et al. (1993), White et al.
(19972), and Smallwood et al. (1998), as well as more 
theoretical discussions presented by Noss (1987), 
Noss et al. (1995), Norton and Ulanowicz (1992), and 
Heywood (1994). In this study, I focused on spatial 
patterns of species, wildlife habitat types, and areas of 
special ecological interest resolved at 1-100 hectares 
over an ecoregional extent. I also considered patterns 
of habitat fragmentation and suitability of areas for 
restoration. Biodiversity has several meanings, though, 
and for this study I focused on an analysis of ecosys- 
tem, species, and genetic diversity, which incorporated 
compositional and structural elements (Noss 1990a) 
within a county. My comparison of the county biodi- 
versity through time was the temporal equivalent of 
delta diversity (Whittaker 1977). The use of multiple 
criteria runs the risk of compounding uncertainty be- 
cause of incomplete and inaccurate data, but this is an 
acceptable trade-off for increasing relevance to the 
biodiversity analysis (Costanza 1992). I emphasize 
that the final products from this model are not 
intended to stand alone. Rather, this SDSS represents
one component of biodiversity that should be sup- 
ported by finer, and in some cases coarser, scales of 
species and habitat analysis. 

Habitat loss or fragmentation due to urban devel- 
opment is only one of many anthropogenic impacts on 
biodiversity, but it now ranks among the principal 
causes of species endangerment in the United States 
(Dobson et al. 1997; Vitousek et al. 1997; Wilcove et 
al. 1998). A variety of methods are available for pre- 
dicting urban growth—I chose a relatively simple, 
general model that could be applied over counties or 
larger areas using 10–50-year planning horizons and
generally available data. Because many land-use deci- 
sions are made at the county level, this county analysis 
promotes conservation opportunities (Press et al. 
1996). This study also used ecoregional boundaries to 
ecologically define the extent of the study area. Al- 
though these spatial and temporal scales will not 
apply to all situations, they will be useful in many 
contexts of biodiversity protection and will provide a 
starting point for the method I now present. 

This Santa Cruz County, California, case study 
demonstrates how biodiversity value can be assessed 
as a spatially explicit property and seeks to predict 
where these values will decrease in the future. 


Development of Biodiversity 
Analysis Submodels 


The biodiversity value of a site was based on four cri- 
teria, or submodels. Ecoregional analysis described 
the county wildlife habitat types which contributed 
disproportionately to the regional distribution; species 
habitat models predicted composition of terrestrial
vertebrate species within landscapes; restoration op- 
portunities were identified; and special features indi- 
cated presence of special natural landscapes. 

Threat, or future conflict, was assessed based on
two additional submodels: habitat-specific conversion 
to urban land use over the county subregion and land- 
scape pattern analysis based on changes in patch cohe- 
sion (Schumaker 1996). 

By maintaining each submodel independently, I 
promoted flexibility in the model as a whole, enabled 
open dialog between model users and stakeholders, 


and clarified results. After completion of each sub- 
model, my results were best interpreted individually, 
though under some circumstances they could also 
have been additively combined as a weighted linear 
combination (Eastman et al. 1995) or in an analytic 
hierarchy (Saaty 1980; Anselin et al. 1989; Saaty and 
Vargas 1994). 

Whereas the relative contribution of each submodel 
to final decision making was context dependent and 
adjustable, it was important to include each in the 
analysis. These four biologic indicators and the land- 
scape pattern analysis were intended to represent a 
collection of biodiversity "principal components,"
each representing its own orthogonal metric to the 
overall representation. This array of components can 
best be visualized using a process flow chart (Fig. 
18.1) to provide an overview of the submodels and 
their interactions in biodiversity analysis. 


Ecoregional Analysis Submodel 


The first part of the biologic assessment, ecoregional 
analysis, was designed to evaluate coarse-scale 
processes that function across large area extents (mil- 
lions of hectares) and decadal periods. Using ecore- 
gions defined by physiographic and biologic consis- 
tency (Hickman 1993), I combined information on 
land management, land cover, and habitat area into a 
single quantitative assessment. The land-cover data 
used the wildlife habitat relationships (WHR) vegeta- 
tion classification system (Airola 1988) from the Cali- 
fornia Gap Analysis Project (GAP), though the same 
types of data are common in other areas. This sub- 
model was based on five explicit assumptions: 


1. Habitat types with more area are more advanta- 
geous to species compared to the same habitat type 
with less area. 

2. Land areas under private ownership are more 
likely to undergo development or habitat loss com- 
pared to certain publicly owned parcels such as 
wilderness areas, wildlife refuges, and parks. 

3. Areas of a given habitat type may be rare in a 
county, but if that same type is common in the 
ecoregion, these areas may not be as critical for 
conservation as suggested by an analysis based on 
county data alone. 
Figure 18.1. Components of the biodiversity model and their interactions. Boxes containing italic text
represent input data; central boxes containing dashed lines represent the four biodiversity submod- 
els. Output data are depicted at the bottom of the figure. Double-headed arrows represent possible 
feedback mechanisms. Components of the model functioning at relatively coarse spatial scales are 
grouped to the left side and finer scale components are located to the right. The analytic hierarchy 
process and summary are optional. 


4. Areas of a given habitat may be common in a equivalent to similar assemblages in other ecore- 


county, but if that same type is regionally rare, 
these areas may be of more concern for conserva- 
tion than county data alone suggest. 

. Habitats and their associated species represent a 
unique community assemblage within an ecoregion. 
This complex of biotic and abiotic interactions is 
only partially documented and cannot be considered 


gions. See Walker (1992) for further discussions on 


inter-ecoregional community turnover. 


Based on these five assumptions, I assessed the habi- 


tat types found in the study area. I began by calculating 
the ratio of land areas between the county and the 
ecoregion, summarized by habitat type. Privately held 
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land areas were down-weighted compared to most 
classes of public lands. This down-weighting was ac- 
complished by counting only a user-selectable fraction 
(e.g., 75 percent) of each privately held habitat patch. 
Expected values were derived from the ratio of county 
size to ecoregion size, and habitat types that were 
within 20 percent of their expected areas were neutrally 
weighted. Habitats that exceeded the expected area by 
more than 20 percent (those concentrated in the 
county) were categorized as high ecoregional impor- 
tance and those with less than 20 percent of expected 
area were categorized as low ecoregional importance. 
This approach works well if the ecoregion is relatively 
intact; other areas with extensive land-use alteration 
could benefit from historic ecoregion descriptors for the 
county comparison. The quantitative output from this 
submodel was a numerical score for each habitat type 
in the county. These scores were then aggregated to de- 
pict low-, neutral-, and high-importance areas, which 
could be graphically represented in a three-color map of 
the county (see Fig. 18.2 in color section). The results of 
the ecoregional analysis were not intended to be ab- 
solute. Adjustable parameters such as optional weight- 
ings for land ownership allowed the model user to fine- 
tune the model for a given region or particular 
application. Each biodiversity submodel was designed 
to capture specific information, leaving other details to 
submodels that were more appropriate. For example, 
the ecoregional analysis would not identify a habitat 
that had been decimated throughout the ecoregion; 
however, in this situation, the special-features submodel 
(described below) came into play. 
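
The scoring rule described in this section can be illustrated with a short sketch. It is not the code used in the study: the function names are hypothetical, the low-importance class is interpreted here as more than 20 percent below the expected area, and applying the same down-weighting to county and ecoregion patches is an assumption.

```python
def weighted_area(patches, private_fraction=0.75):
    """Sum patch areas, counting only a fraction of each privately held patch.

    patches: iterable of (area_ha, is_private) tuples.
    private_fraction: user-selectable fraction (0.75 is the example value in the text).
    """
    return sum(area * (private_fraction if is_private else 1.0)
               for area, is_private in patches)


def ecoregional_importance(county_habitat, ecoregion_habitat,
                           county_size, ecoregion_size, tolerance=0.20):
    """Classify one habitat type as 'low', 'neutral', or 'high' importance.

    The expected county area is the ecoregion-wide habitat area scaled by the
    ratio of county size to ecoregion size.
    """
    expected = ecoregion_habitat * (county_size / ecoregion_size)
    ratio = county_habitat / expected
    if ratio > 1.0 + tolerance:
        return "high"      # habitat concentrated in the county
    if ratio < 1.0 - tolerance:
        return "low"       # assumption: more than 20 percent below expected
    return "neutral"       # within 20 percent of the expected area
```

For a county covering one-tenth of its ecoregion, a habitat type with 500 units of weighted area in the ecoregion has an expected county area of 50 units, so a county holding 80 units of that type would be classed as high importance.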

Example of Ecoregional Analysis: Redwood forest 
habitat in Santa Cruz County. Most of the redwoods 
(Sequoia sempervirens) in the central west ecoregion
are contained within Santa Cruz County. Conse- 
quently, this habitat type has a greater importance in 
the county than would otherwise be considered using 
county data in isolation. The high importance areas 
identified by the ecoregional analysis (Fig. 18.2) were 
strongly influenced by the restricted distribution of 
redwood forest. 


Species Habitat Submodel 


The second part of the biologic assessment utilized
species habitat submodels. Using California GAP data
models, I identified habitat areas as landscape features
and associated each with a list of vertebrate species 
(Scott et al. 1993; Hollander et al. 1994; Davis et al. 
1998). The GAP models were designed to provide as- 
sessments of conservation status for native vertebrate 
species and natural land cover types (Scott and Jen- 
nings 1997). Areas of nonurban, nonagricultural land 
cover were considered important wildlife habitats in 
this analysis. For the species habitat submodels, I assumed
that habitat types within an ecoregion are ecologically
equivalent. The model output was a species list for
each landscape unit in the county.

Example of Species-Habitat Relationships: Black 
salamander. Using a combination of species range and 
habitat suitability, the black salamander (Aneides 
flavipunctatus) is predicted to occur in two regions 
within California (California GAP data, Davis et al. 
1998). One of these areas is in Santa Cruz County. By 
identifying the specific landscape units within the
county that are highly suitable for the species, both
species and spatially explicit landscape units could be
used to analyze biodiversity.


Restoration Opportunities Submodel 


Restoration opportunities, the third component of this 
four-part biodiversity model, defines areas of potential 
restoration as areas capable of increased biodiversity 
value with feasible levels of management change. In 
many cases, urban areas have low biodiversity value, 
and it is often not practical to improve them. Sites com- 
posed of invasive species monocultures, disrupted wet- 
lands, and timber harvest areas are all candidates for 
restoration (Dobson et al. 1997). Farmlands may also 
have low value, though land-use practices can often be 
modified to improve habitat with little or no financial 
impact to the farmer (Ratti and Scott 1991). Farmlands 
are likely to have an increasing role in conservation 
efforts (Pienkowski et al. 1996), and the role of agricul- 
ture in this biodiversity analysis was intentionally in- 
cluded to avoid interpreting biodiversity as a pheno- 
menon limited to nature reserves. As a preliminary 
approach, this submodel identified all agricultural areas 
in the county as restoration opportunities, with final 
output in a map format.

Example of Restoration Opportunities: Agriculture.
Minor modifications can be made to agricultural
areas that can potentially increase biodiversity value
(Pienkowski et al. 1996). Examples include larger 
hedgerows between fields (EIP-Associates 1997) and 
changes in pesticide application techniques. 


County Special Features Submodel 


The fourth biodiversity submodel identified and 
mapped special features found within the county. Examples
of well-known special features in Santa Cruz
County include unique soils and old-growth red- 
wood forests. These data were obtained through in- 
terviews with local experts, who often have remark- 
able insights and information about such features. 
Complementing these data is a geographic informa- 
tion system (GIS) database, maintained by the Cali- 
fornia Department of Fish and Game, Natural Her- 
itage Division, on rare plant and animal habitats 
(Hoshovsky 1988). This portion of the analysis typi- 
cally functioned at a fine spatial resolution using in- 
formation about present-day known critical areas. 
Although these types of data will usually be some- 
what biased and incomplete, any knowledge of cur- 
rent problem areas is an extremely valuable compo- 
nent of biodiversity analysis. The purpose of this 
submodel was to identify and locate these important 
components of biodiversity. Each special area could 
then be evaluated for existing or proposed protection 
status, directly modifying the results of the urban 
growth model. Without protection status, special 
areas are subject to the same threats as other areas. 
Traditionally, the various levels of protection status 
for a species or habitat have been an important com- 
ponent of conservation strategy. Although the meth- 
ods described here took advantage of this informa- 
tion, the unprotected special areas and currently 
unidentified species of concern were likely to be 
equally important in the fifty-year outlook being 
considered. Output from this submodel was a GIS 
database identifying special features areas for protec- 
tive-status consideration. 

Example of County Special Features: The Sand- 
hills soil type in Santa Cruz County. Sandhills soils 
support several local species of plants that are cur- 
rently threatened (Marangio and Morgan 1987). 
This soil type is an indicator for important biodiversity
areas that are currently (and not just predicted to be)
stressed by urbanization. This submodel flagged
these areas as “special” without attempting to rank 
them by size, shape, or other landscape-pattern 
metrics. 


Landscape Pattern Analysis 


Landscape pattern analysis is a submodel that com- 
plements the four biodiversity submodels for assess- 
ing one measure of habitat stress or degradation fol- 
lowing land-use change. As patches of habitat 
change size, shape, and adjacency, habitat quality 
can also change. Landscape factors that affect habi- 
tat quality can be assessed at three stages within the 
biodiversity analysis: (1) assessment of ecoregional 
importance, (2) assessment of habitats impacted by 
urban growth, and (3) habitat evaluation for individ- 
ual vertebrate species (Fig. 18.1). For this case study, 
the second stage, evaluating predicted urban growth 
to assess habitat alteration, was used. Although di- 
rect landscape measurements such as fragmentation, 
contagion, and cohesion are possible, the resulting 
values are difficult to interpret ecologically. My solu- 
tion was to apply a habitat cohesion calculation 
twice—before and after the urban-growth predic- 
tions were calculated. In this way, those areas shown 
to undergo the most radical landscape changes could 
be flagged for further analysis at the species or habi- 
tat-specific level. Submodel output was the ratio of 
current and predicted landscape patch cohesion 
scores associated with each habitat type. 

Example of Landscape Pattern Analysis: Patch co- 
hesion. By comparing this area-weighted metric of 
each landscape unit before and after applying the 
urban growth model, habitat types with the greatest 
change could be flagged. I used patch cohesion (Schu- 
maker 1996) modified as a vector index of habitat 
pattern calculated as: 


where:

$\varepsilon \propto$ (perimeter : area) + shape index
$\max(\varepsilon)$ = case of many small patches
$\min(\varepsilon)$ = case of single large patch
P = perimeter of habitat patch
A = area of habitat patch
PC = patch cohesion, range (0,1)

where patch cohesion (PC) was calculated for each
habitat type, and comparisons of pre-growth to post-growth
models were calculated as $PC_{\text{pre-growth}} / PC_{\text{post-growth}}$.
The limit values of $\varepsilon$ were calculated assuming
two extreme cases. At maximum $\varepsilon$, the habitat type
consisted of many small circular patches with individual
patch area constrained by the minimum map unit
(MMU). At minimum $\varepsilon$, the index was calculated
assuming a single large circle of habitat. In both cases,
total area of each habitat was set equal to the actual 
area of that habitat type in the dataset. This submodel 
allowed ranking of habitat types for degree of cohe- 
sion change, identifying imperiled habitats that might 
be overlooked by area calculations alone. Typically, 
this analysis would identify vegetation areas that had 
become fragmented or otherwise eroded into linear 
shapes with reduced area-to-perimeter ratios. Once 
identified, heavily altered habitats were prioritized for 
site-specific analysis. 
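
The two limiting cases can be made concrete with a small sketch. The exact form of the modified index is not reproduced in this text, so the code below rests on stated assumptions: the index (called `eps` here) is taken to be total perimeter divided by total area for a habitat type, and cohesion is taken to be a linear rescaling of that index between the two limits. Only the limiting cases themselves (many MMU-sized circles versus one large circle of equal total area) and the pre-growth to post-growth ratio come from the description above; all function names are hypothetical.

```python
import math


def limit_indices(total_area, mmu):
    """Return (eps_min, eps_max) for a habitat type of the given total area."""
    # Minimum: all habitat in a single large circle.
    eps_min = 2.0 * math.sqrt(math.pi * total_area) / total_area
    # Maximum: the same area split into circles the size of the minimum map unit.
    n_patches = total_area / mmu
    eps_max = n_patches * 2.0 * math.sqrt(math.pi * mmu) / total_area
    return eps_min, eps_max


def normalized_cohesion(patches, mmu):
    """Rescale perimeter:area to [0, 1]; 1 means one compact patch (assumed form).

    patches: list of (perimeter, area) tuples for one habitat type.
    """
    total_p = sum(p for p, _ in patches)
    total_a = sum(a for _, a in patches)
    eps = total_p / total_a
    eps_min, eps_max = limit_indices(total_a, mmu)
    if eps_max <= eps_min:          # habitat no larger than one map unit
        return 1.0
    pc = (eps_max - eps) / (eps_max - eps_min)
    return min(1.0, max(0.0, pc))   # irregular shapes can fall outside the limits


def cohesion_change(pre_patches, post_patches, mmu):
    """Ratio of pre-growth to post-growth cohesion; larger values mean more change."""
    post = normalized_cohesion(post_patches, mmu)
    if post == 0.0:
        return math.inf
    return normalized_cohesion(pre_patches, mmu) / post
```

Under these assumptions, a habitat type that is split by forecast growth from one square patch into two smaller squares of the same total area yields a ratio greater than 1, which is the kind of change the submodel is meant to flag.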


Development of Urban Growth 
Predictive Models 


The biodiversity and landscape pattern submodels 
presented an indicator of environmental condition 
over a countywide area. This information was based 
on current conditions and provided spatially explicit 
ecological data, which were used as a starting point for
predictive modeling. To examine the effects of increas- 
ing urbanization over the next forty years, an urban 
growth model was applied to my present-day biodi- 
versity analysis. Urban models have recently been de- 
veloped (Makse et al. 1995; Couclelis 1997; White et 
al. 1997b; Batty 1998; Landis and Zhang 1998; 
Makse et al. 1998), which incorporate a broad range 
of variables and spatial dependencies; however, by fol- 
lowing the approach taken by Clarke and Gaydos 
(1998), I arrived at a model that was transportable 
and largely independent of spatial grain or extent. 

A cellular automata approach to modeling future 
growth was used, incorporating likelihood of urban 
sprawl, likelihood of entirely new communities being 
developed, and likelihood of urban areas expanding 
along transportation corridors. Model inputs included 
spatial data on urban areas, roads, land ownership, 
and topography. Spatial data were resampled to 100-meter
grids, though the model was flexible enough to
function at other levels of resolution. I also incorpo- 
rated user-adjustable variables for growth rate, 
growth type, and time period. Figure 18.3 (see color 
section) represents output from the urban model 
showing areas likely to become urbanized by the user- 
specified target year. These areas of likely future ur- 
banization are a key component in the assessment of 
future biodiversity change. 
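
For readers unfamiliar with cellular automata, the following toy sketch shows the general mechanism on a small grid. It is not the Clarke and Gaydos (1998) model or the model used in this study: it implements only contiguous edge growth, omits new-settlement and road-influenced growth, and the `spread` coefficient and `excluded` mask (standing in for protected or unbuildable cells) are illustrative assumptions.

```python
import random


def grow_urban(urban, excluded, spread=0.2, steps=1, seed=0):
    """Advance a toy urban-growth cellular automaton.

    urban, excluded: equally sized lists of lists of bool.
    spread: per-neighbor conversion probability (hypothetical parameter).
    Returns a new grid; urban cells never revert and excluded cells never convert.
    """
    rng = random.Random(seed)
    rows, cols = len(urban), len(urban[0])
    grid = [row[:] for row in urban]
    for _ in range(steps):
        nxt = [row[:] for row in grid]
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] or excluded[r][c]:
                    continue
                # Count urban cells among the (up to) eight surrounding cells.
                neighbors = sum(
                    grid[rr][cc]
                    for rr in range(max(0, r - 1), min(rows, r + 2))
                    for cc in range(max(0, c - 1), min(cols, c + 2))
                    if (rr, cc) != (r, c)
                )
                if rng.random() < min(1.0, spread * neighbors):
                    nxt[r][c] = True
        grid = nxt
    return grid
```

Each step reads from the previous grid and writes to a copy, so all cells update synchronously. Changing `spread`, `steps`, or the exclusion mask corresponds loosely to the user-adjustable growth rate, time period, and land ownership inputs described above.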


Submodel Combination for Composite Valuation 


Combining the results of the submodels tended to 
mask the relationships between different aspects of 
biodiversity and environmental threats. In some cases, 
however, it may be necessary to combine data, for in- 
stance, when comparing multiple scenarios, perform- 
ing sensitivity analysis of the submodels, or creating 
public-presentation summaries. Using the landscape 
unit of analysis, the combined data can represent a 
composite valuation of future biodiversity risk for the 
county area. A simple way to integrate the submodel
results was to keep these landscape units as a single 
common currency, maintaining the option of biodiver- 
sity evaluation by species using the WHR models 
when needed. This approach had the advantage of 
simplicity, which in some cases might outweigh the 
disadvantages of data generalization and the uncer- 
tainties of the WHR models. 

When combining submodel results, several types of 
weightings are possible, achieved by adjusting the rel- 
ative input of each submodel. Since the relative 
weighting of each component will involve some degree 
of uncertainty and will often involve multiple decision 
makers, it may be useful to use the Analytic Hierarchy 
Process (AHP) as a decision analysis tool (Saaty 
1994). AHP assists multiple users in building a con- 
sensus of component weights and maintains explicit 
weightings both within and between submodels. The 
weightings for each submodel are intended to be user- 
adjustable. For example, some model users may wish 
to de-emphasize restoration potential whereas other 
users may prefer to make it a major factor in the 
analysis. AHP is included in the process flow chart 
(Fig. 18.1) but is optional. 
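
A weighted combination of this kind might be sketched as follows. The pairwise-comparison step uses the row geometric-mean approximation to AHP priority weights rather than the principal-eigenvector calculation described by Saaty, and the assumption that each submodel score has already been rescaled to a common 0 to 1 range per landscape unit is mine, not the chapter's. Function names are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights from a reciprocal pairwise matrix.

    pairwise[i][j] states how much more important component i is than j.
    Uses the row geometric mean, a common approximation to the eigenvector.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def composite_score(scores, weights):
    """Weighted mean of submodel scores for one landscape unit.

    scores, weights: dicts keyed by submodel name; weights need not sum to 1.
    """
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight
```

For example, a two-component comparison in which the first submodel is judged twice as important as the second gives weights of two-thirds and one-third. Setting a weight to zero drops that submodel from the composite, which is one way a user could de-emphasize restoration potential.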


18. Biodiversity Conflict Analysis at Multiple Spatial Scales 235 


An Analysis of 
Post-urbanization Biodiversity 


My analysis of post-urbanization biodiversity is presented
as a series of discrete measures. The results are
reported in terms of ecoregionally important habitats, 
compromised within-county habitats, impacted verte- 
brate species, loss of high-restoration-potential areas, 
ecologically critical areas of risk, and landscape- 
pattern changes. These results are spatially explicit 
and are best presented with the aid of interpretive 
maps. 

Ecoregional analysis addressed only the coarse- 
grain perspective of biodiversity. In this assessment 
three of the twelve dominant WHR habitat types in 
Santa Cruz County were more important than a 
county perspective alone would show. These habitat 
types are redwood, montane hardwood conifer, and 
Douglas-fir (Pseudotsuga menziesii) forest. These 
three habitat types represented the high-priority con- 
servation targets identified by this submodel, without 
considering the effects of predicted urbanization. 

Habitat loss from urban growth is a second discrete 
measure of biodiversity impact. Here, my cellular au- 
tomata urban growth model predicted an 11 percent 
increase in urbanized land cover in Santa Cruz County 
over the next fifty years. Much of the new growth was 
predicted to be in the form of contiguous expansion 
rather than isolated new settlement (Fig. 18.3). Habi- 
tat losses resulting from this growth included a 13 per- 
cent reduction from the current extent of redwood 
forests, a 10 percent reduction of montane hardwood
chaparral, and a 4 percent loss of Douglas-fir forests. 
Santa Cruz County has a long history of human- 
induced landscape alterations. Because these transi- 
tions are largely undocumented, historic reductions to 
date cannot be calculated. My results from this sub- 
model were therefore based on habitat loss calculated 
from 1990 levels, which was an arbitrary but logical 
reference point. Habitats that were already seriously 
compromised before 1990 should be identified using 
either the special features submodel or the restoration 
opportunities submodel. 

Species impacts are a third measure of biodiversity 
loss. Using the California GAP species habitat models, 
vertebrate species most strongly affected by my future
urban-growth scenario included the hermit warbler
(Dendroica occidentalis), the golden-crowned kinglet 
(Regulus satrapa), and the American shrew-mole 
(Neurotrichus gibbsii). Sixteen avian, two mammal, 
three amphibian, and two reptile species were pre- 
dicted to lose at least 10 percent of their present-day 
habitat in the Jepson Central West ecoregion (Table 
18.1). 
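
The species tally can be sketched as a simple overlay of landscape units, species lists, and forecast growth. The data structures and names below are hypothetical. Following the conservative rule stated in the Discussion, a landscape unit touched by forecast growth is counted as compromised in full, whatever fraction of it is actually converted.

```python
def species_habitat_loss(unit_area, unit_species, encroached):
    """Percent of each species' habitat area lying in encroached landscape units.

    unit_area: dict of unit id -> habitat area.
    unit_species: dict of unit id -> set of species predicted for that unit.
    encroached: set of unit ids intersected by forecast urban growth.
    """
    total, lost = {}, {}
    for unit, area in unit_area.items():
        for species in unit_species.get(unit, ()):
            total[species] = total.get(species, 0.0) + area
            if unit in encroached:
                lost[species] = lost.get(species, 0.0) + area
    return {sp: 100.0 * lost.get(sp, 0.0) / total[sp] for sp in total}


def flag_species(loss_percent, threshold=10.0):
    """Species at or above the loss threshold, most affected first."""
    flagged = [sp for sp, pct in loss_percent.items() if pct >= threshold]
    return sorted(flagged, key=lambda sp: -loss_percent[sp])
```

The output is a screening list for further analysis, not an estimate of population decline.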

Areas of potential restoration were evaluated for 
losses following urban growth. In my model, 30 per- 
cent of the new growth forecast was predicted to dis- 
place cropland areas, equivalent to a 17 percent loss 
of county croplands by the year 2040. Other possible 
restoration areas such as urban riparian zones, im- 
pacted coastal zones, and timber harvest areas were 
not included in this case study. 

The county special features analysis identified sev- 
eral types of fine-grain special features in Santa Cruz 
County. Areas of special concern included unique veg- 
etation complexes associated with sandhills soils, old 
growth redwood forest, and regions identified as Sig- 
nificant Natural Areas by the California Department 
of Fish and Game. In this preliminary analysis, I did 
not assign protective status to special features areas 
but instead permitted the growth model to forecast ur- 
banization in these regions. In many instances, the 
special features areas I identified were not at the time 
protected by state or federal regulation. Allowing a 
first-iteration growth model to select from these habi- 
tats is useful for identifying locations of possible fu- 
ture conflict. 

By calculating the patch cohesion landscape metric 
of each habitat type before and after predicted 
growth, I was able to rank order the habitats by mag- 
nitude of landscape alteration (Table 18.2). Although 
absolute cohesion value is not directly meaningful (see 
discussion), it was useful to note the greatest land- 
scape pattern effects in patches of coastal scrub, 
coastal oak woodland, and redwoods. 


Discussion 


By analyzing the major components of biodiversity, I
sought to clarify the assumptions, strengths, and limitations
inherent in such a complex analysis. This deconstruction
also permitted me to target the most
objective data sets and clarify where scale dependencies
and assumptions regarding the input data were
present. The scale dependencies in this biodiversity
analysis were revealed when one or more of the scale-specific
submodels were judged by the model user to
be critical in the analysis. As an example, some sites or
species may have had critical biodiversity value due
largely to local (small area) effects. In this case study,
redwood habitats were of critical importance in every
submodel, whereas areas of coastal oak woodland and
coastal scrub were emphasized solely through the
landscape pattern analysis.

TABLE 18.1.

Species most affected by future urban growth in Santa Cruz County, California. Of 260 vertebrate species with
habitat in the Jepson central west ecoregion, thirty-nine species are predicted to lose more than 5 percent of
their habitat, and the twenty-three species listed here are predicted to lose more than 10 percent of their
habitat. Area impact refers to the percentage of the ecoregional habitat predicted for urban conversion. Listed
status of "CA SSC" refers to California Species of Special Concern, as described by the California Department
of Fish and Game (January 1999).

Common name                    Scientific name          Area impact (%)  Listed status
hermit warbler                 Dendroica occidentalis   40               none
golden-crowned kinglet         Regulus satrapa          39               none
shrew-mole                     Neurotrichus gibbsii     38               none
Vaux's swift                   Chaetura vauxi           38               CA SSC
pine siskin                    Carduelis pinus          38               none
MacGillivray's warbler         Oporornis tolmiei        38               none
olive-sided flycatcher         Contopus cooperi         37               none
Trowbridge's shrew             Sorex trowbridgii        35               none
black salamander               Aneides flavipunctatus   35               none
Pacific giant salamander       Dicamptodon ensatus      34               none
northern alligator lizard      Elgaria coerulea         28               none
hermit thrush                  Catharus guttatus        27               none
sharp-shinned hawk             Accipiter striatus       24               CA SSC
rubber boa                     Charina bottae           24               none
pileated woodpecker            Dryocopus pileatus       19               none
Cassin's vireo                 Vireo cassinii           18               none
white-crowned sparrow          Zonotrichia leucophrys   17               none
winter wren                    Troglodytes troglodytes  15               none
chestnut-backed chickadee      Poecile rufescens        12               none
American goldfinch             Carduelis tristis        11               none
brown creeper                  Certhia americana        11               none
flammulated owl                Otus flammeolus          11               none
California slender salamander  Batrachoseps attenuatus  11               CA threat

In the ecoregional submodel, data inputs were spa- 
tially coarse grained as is appropriate in studies of 
larger areas. Thematically, I also conducted this analy- 


sis at a coarse-grain scale using a generalized habitat 
classification to describe the county in an ecological 
context. The ecoregional analysis included an inten- 
tional bias for land management, incorporating the 
judgment of the model user to set the urban conver- 
sion probability of public versus private lands. This 
was done by counting only a user-selectable fraction 
of habitats where they were privately held. A more ex- 
plicit approach would have extended the urban 
growth model from the county to the entire ecoregion; 
however, the benefits gained from this refinement may 
not have justified the added complexity in the model. 

The county special features data represented elements
that were detectable only when using a fine-grain
approach to biodiversity analysis. These areas
were the current habitats of concern in the county,
often representing fragile remnants of once-larger or
more-numerous landscape features. This resolution of
analysis and associated small plots may be particularly
important for preserving small populations of plants
(Lesica and Allendorf 1992). The special features submodel
also provided data concerning what were typically
the most studied and understood environmental
issues in an area and were valuable examples of biodiversity
loss in the area. This portion of the data, when
combined with larger-extent biodiversity metrics, allowed
detailed study results about single species to be
leveraged or combined with other data for a more
complete understanding of regional biodiversity. In
some instances, special areas in the county will be adequately
protected by existing state or federal law. At
other times, a special area may still be vulnerable,
even with state or federal protection, or it may not
possess such protection. Because of this political and
legal uncertainty, realistic assessment of impacts on
special areas is best deferred until county planners can
modify the initial iteration of the urban-growth
scenario.

TABLE 18.2.

Landscape analysis results. Major habitat types in Santa Cruz
County, California, listed in order of patch cohesion alteration.
Cropland and urban area effects are also included.

Habitat type                  Rank order
coastal scrub                 most altered
coastal oak woodland
redwood
montane hardwood
chamise-redshank chaparral
annual grassland
cropland
urban                         least altered

The species habitat submodels can function at a va- 
riety of spatial resolutions, though the California GAP 
vertebrate models are largely based upon the WHR 
models developed for a statewide area. WHR has not 
been immune to criticism (Block et al. 1994), though 
validation of these types of models is a challenging 
problem in itself (Marcot et al. 1983; Karl et al., 
Chapter 51). The California GAP models offer the
refinement of updated and improved habitat descriptions;
however, this did not permit the models to replace
field surveys of local interest areas. It may have
been reasonable to fine-tune the vertebrate models
using local landcover data; however, this would still
have yielded only a first-draft list of species needing
verification by field survey.

The landscape pattern analysis submodel addressed 
the issue that patches of habitat are not simple sum- 
maries of their areas, notably that patch shape is an 
important element in any biodiversity stressor model. 
Landscape pattern metrics are commonplace in the re- 
search literature (Pearson et al. 1995; Schumaker 
1996; O'Neill et al. 1997; Wickham et al. 1999) even 
if they are little used in the planning process. Although
there is not universal agreement on what the many 
landscape metrics mean, it is useful to identify the ex- 
treme changes in area/perimeter relationships follow- 
ing habitat modification. The identified habitat types 
can then be assessed individually to determine if the 
forecast change will be biologically important. By cal- 
culating patch cohesion, which is a variant on the 
perimeter-to-area ratio, I compared habitat patches 
before and after the application of the urban growth 
model. Where projected urbanization most radically 
altered habitat, as in the case of coastal sage scrub, 
this landscape analysis was useful to flag specific habi- 
tat types for possible future conflicts. Future improve- 
ments in habitat map grain and WHR reliability will 
improve this submodel. However, as fine-grain habitat 
data becomes available, issues of species perception of 
habitat edges will also become more critical. 

I used an urban growth model to address the issue 
of stressors to the environment. Buildout plans or 
other spatially explicit land-use scenarios could also 
function with my biodiversity model. The approach I 
have adopted for modeling urban growth is best used 
as an iterative model, letting the growth-model results 
be assessed by knowledgeable county planners who 
guide the tuning of model parameters in a series of 
model runs. By restricting input data to spatially ex- 
plicit physical features (slope, roads, protected areas, 
etc.) and avoiding the less-tenable political and socioe- 
conomic dependencies, I optimized the growth model 
for a long-term (20-50-year) forecast. Other models
incorporating data on politically guided development
patterns and population growth may be more appropriate
for short-term (5-20-year) forecasts. For the
purposes of this analysis, I have used the urban
growth scenario generated by my model without the
benefit of iterative tuning by county-planning personnel.
This additional step would implicitly incorporate
some of the near-term socioeconomic trends and
could have improved the utility of my results. Com- 
parison of multiple urbanization scenarios and sensi- 
tivity analysis is also a logical approach to explore. 

Starting with the California GAP vertebrate mod- 
els, the habitat sites encroached upon by forecast 
urban growth were considered “compromised” re- 
gardless of the actual percentage of the habitat patch 
converted to urban land use. This is a conservative ap- 
proach, providing a list of species worthy of further 
analysis to determine if continued habitat pressures 
could indeed disrupt the species. Few of the species I 
identified were particularly threatened at the time of 
the study, but as such, they were also not always con- 
sidered in the planning process. Many of the species I 
identified for priority treatment during the next forty 
years were actually species that marginally included 
the Jepson Central West ecoregion in the southern 
fringe of their habitat. Species such as the flammulated 
owl (Otus flammeolus) may be doing quite well in the 
Sierra Nevada, but it may not be reasonable to assume 
that the species functions in exactly the same ecologi- 
cal niche across different ecoregions. Other species 
such as the California giant salamander (D. tenebro- 
sus) do not have continuous distributions connecting 
to another ecoregion, suggesting possible differentia- 
tion within the species. 

The loss of farmlands to urban development is an 
issue of general importance, but I limited my discus- 
sion here to farmlands as they relate to biodiversity. 
From this perspective, farmlands are areas that do not 
currently support a diverse assemblage of native 
species; however, especially when compared to other 
more-destructive types of land-use practices, farm- 
lands are resource areas that could be adapted for 
conservation management. In my model, the forecast 
17 percent loss of agricultural areas is largely due to 
adjacency of existing urbanized areas, ready access to 
road transportation, and flat, buildable terrain. Other 
areas could also have been appropriately considered
to have restoration value, such as degraded wetlands
or any area disrupted to the extent that biodiversity 
function is compromised. 


Conclusion 


My analysis has quantified potential future biodiver- 
sity loss by reporting the habitat types, vertebrate 
species, and restoration areas most likely to be ad- 
versely impacted in coming decades. The results pre- 
dict potential biodiversity degradation in specific areas 
of coastal scrub, redwood forest, coastal oak wood- 
lands, montane hardwoods, and Douglas-fir. I also de- 
scribed potential conflicts in terms of vertebrate 
species, and raised additional issues of agricultural 
losses. Each submodel reported spatially explicit results,
which, when presented as maps, offer an easily
comprehensible form of data presentation (e.g., Figs.
18.2 and 18.3). In several cases—notably for redwood 
habitats—the submodels provided converging evi- 
dence of future biodiversity stress. 

Ideally, urban planners, land managers, and biolo- 
gists will analyze urban-growth scenarios for pre- 
dicted impacts on biodiversity and use these models 
for iterative reevaluation of land-use planning and re- 
serve design. Iterations are powerful techniques in bio- 
diversity analysis, allowing the model user to change 
assumptions and parameters such as ecoregional land 
ownership, new road placement, and urban growth 
type. With each iteration, biodiversity losses can be 
minimized and a more thorough understanding of 
county ecosystem health can be achieved. My case 
study of Santa Cruz County, California, was designed 
to be a general model, allowing it to be applied in 
other areas. 

Each of the submodels functions with data that were
gathered at particular spatial and temporal scales. 
Since biodiversity is itself a multiscale phenomenon, I 
have strived to design an analysis approach that cap- 
tures this variation. In some cases, portions of the 
model will not reveal threats to biodiversity, though 
this may be difficult to determine before compiling the 
data. By analyzing the fundamental components of 
biodiversity and reporting the results as discrete sub- 
models, I can determine the scale at which the domi- 
nant stressors are functioning and ensure that they are 
not overlooked. At times, these multiscale processes
may be nonintuitive. As an example, a study looking 
at local-scale dynamics may not reveal background 
effects from regional-scale stressors. Altering the bio- 
diversity submodels can also reveal hidden scale de- 
pendencies. For example, scale sensitivity can be de- 
termined by generalizing the ecoregional habitat map. 
In other cases, the urban growth models can be run 
with a larger grid size, and the area of analysis can be 
enlarged or reduced to look for grain- and extent-de- 
pendent effects. Regardless of the choice of map 
scales, some of the biodiversity indicators are inher- 
ently scale dependent. Wildlife habitat models func- 
tion best at larger grain sizes, urban growth models 
are not appropriate at the submeter level, and areas of 
special concern are not subject to area or landscape 
metrics. Because of these multiscale properties, I do 
not attempt to sum the submodels in terms of land 
area or species counts but instead consider levels of 
impact summarized over landscape units as the com- 
mon theme. 
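
The grain-dependence check described above can be illustrated with a minimal Python sketch. It is not the model used in this chapter: the example grid, the majority rule for aggregation, and the function names coarsen and habitat_fraction are assumptions made purely for illustration.

```python
def coarsen(grid, factor):
    """Aggregate a binary habitat grid into factor x factor blocks.

    A coarse cell is labeled habitat (1) only when more than half of the
    fine cells it covers are habitat (a majority rule, assumed here).
    """
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            coarse_row.append(1 if 2 * sum(block) > len(block) else 0)
        coarse.append(coarse_row)
    return coarse


def habitat_fraction(grid):
    """Fraction of cells labeled habitat."""
    cells = [value for row in grid for value in row]
    return sum(cells) / len(cells)


# Hypothetical 4 x 4 fine-grain habitat map (1 = habitat, 0 = nonhabitat).
fine = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
]
coarse = coarsen(fine, 2)

# Comparing the two fractions shows how the habitat estimate shifts with grain.
print(habitat_fraction(fine), habitat_fraction(coarse))
```

Rerunning an indicator on both grids and comparing the results is one simple way to look for grain-dependent effects; changing the extent of the grid would probe extent dependence in the same manner.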

In each submodel, the resolution of data is also re- 
lated to accuracy of data (see Fielding, Chapter 21). In 
some cases, actual accuracy is not extremely impor- 
tant, whereas in others it is a critical issue. The urban 
growth model is predictive, and measures of “hind- 
casting” will not yield an accuracy assessment of fu- 
ture model performance. More importantly, such 
models provide county planners with a warning about 
probable future conflict areas and are designed for it- 
erative tuning and multiple-scenario comparisons. Ide- 
ally, actual future development patterns will be proac- 
tively guided away from predicted scenarios. In other 
submodels, such as the species habitat models, the 
spatial resolution of the data is important for model 
accuracy. GAP WHR models are being evaluated for 
accuracy (Edwards et al. 1996; Boone and Krohn 
1999); see also Karl et al. and Hepinstall et al. (Chap- 
ters 51 and 53), and Garrison and Lupo (Chapter 30). 
Initial assessment results show encouraging success, 
although much room remains for improvement. This 
issue goes beyond the need for finer-scale habitat maps 
and corresponding WHR models, though these meas- 
ures would certainly benefit the analysis presented 
here. A more difficult task will be to analyze biodiver- 
sity in a manner that includes a more sophisticated 
species-specific habitat analysis, incorporating more of 
the environmental needs of each species into each 
submodel. 

My case study includes several types of ecological 
analysis, and it complements other approaches. 
Species populations can be studied using techniques 
such as population viability analysis (PVA), which is 
based upon measures of critical values like environ- 
mental and demographic stochasticity (Boyce 1992). 
Habitats can also be assessed by their component 
parts, and as shown by this study, ecosystem biodiver- 
sity and ecosystem health can be similarly assessed. As 
population level models are improved and refined, we 
will be able to incorporate PVA-type data with habitat 
viability analysis (HVA) and ecosystem viability analy- 
sis (EVA) to gain a more complete and consistent un- 
derstanding of ecosystem health across multiple scales. 

A Collaborative Approach in Adaptive 
Management at a Large-landscape Scale 

David Solis, Mike Gertsch, Brian Woodbridge, 
Adrienne Wright, Greg Goldsmith, and Chirre Keckler 


Resource managers are increasingly confronted 
with the problem of how to make informed deci- 
sions that rely on new information or tools. This is es- 
pecially true as we attempt to shift land management 
practices to a regional or ecosystem perspective. Fed- 
eral agencies and others have variously defined ecosys- 
tem management, which has led to many debates over 
the concept (Haeuber 1996). Recently, increasing con- 
flicts over resource management have resulted in a 
number of area ecosystem planning efforts (e.g., Great 
Lakes-St. Lawrence River Basin, Interior Columbia 
Basin Ecosystem, Everglades-South Florida, Sierra 
Nevada Ecosystem Project, and Southern California 
Natural Communities [Johnson et al. 1999]). There 
have also been many well-publicized efforts to develop 
comprehensive landscape plans for managing individ- 
ual species, focusing mainly on those that are federally 
listed under the Endangered Species Act (ESA) such as 
the northern spotted owl (Strix occidentalis caurina; 
USDA/USDI 1994a,b), California spotted owl (Strix 
occidentalis occidentalis; Verner et al. 1992), and griz- 
zly bear (Ursus arctos; Burroughs and Clark 1995). 
For a variety of reasons, not all of these planning ef- 
forts have been successful, but all were costly. 
Adaptive management has been emerging as a cen- 
tral theme in the management of natural resources on 
federal lands in the United States, particularly as it ap- 
plies to the concept of ecosystem management (Wal- 
ters and Holling 1990). A true adaptive process in- 
volves a rigorous scientific and repeatable approach to 
resource planning (Holling 1978; Walters 1986). 
Active and passive adaptive management differ in that 
the former actively pursues information as an objective 
of the decision-making process (Nichols et al. 1995). Walters 
(1997) stated that “adaptive management should 
begin with a concerted effort to integrate existing in- 
terdisciplinary experience and scientific information 
into dynamic models that attempt to make predictions 
about the impacts of alternative policies.” He empha- 
sized that this serves three functions: (1) problem clar- 
ification and enhanced communication among scien- 
tists, managers, and other stakeholders; (2) policy 
screening to eliminate options that are least likely to 
succeed; and (3) identification of key knowledge gaps. 

Adaptive management is complex and conceptual, 
and the methods are ambiguous and rarely or only 
partially applied (Lee 1993; Gunderson et al. 1995; 
Walters 1997; Carpenter 1998; Rogers 1998). Failures 
of traditional management that did not use an adap- 
tive approach have occurred most obviously with 
problems in large complex ecosystems (Johnson 
1999b). Many efforts to implement large-area man- 
agement plans (including early attempts on the north- 
ern spotted owl, hereafter spotted owl, or owl) failed 
for a variety of reasons. Managers are often limited 
by one or more of the following: a lack of data, 
inadequate knowledge or understanding of available 
data, lack of useful methods to analyze and interpret 
data in a meaningful way, and/or lack of effective 
communication with researchers (Arnett and Salla- 
banks 1998). Modeling has seldom been a part of nat- 
ural resource management efforts where model predic- 
tions could be tested and used to enhance knowledge 
and improve management (Conroy 1993). In some 
cases, there may be considerable information, but it 
may not be useful for informing decision makers. 
Often there are insufficient resources to use the data, 
especially in a timely manner. Other problems such as 
political pressures or resistance to change within fed- 
eral agencies have also contributed. A few efforts, 
such as those on North American waterfowl (Nichols 
et al. 1995), are succeeding because the primary con- 
stituents agreed that a problem existed, that specific 
information was needed to address the issue (and was 
sought), and how the resulting data would be used to 
modify management plans (Johnson and Williams 
1999). 

Many recent books and papers discuss problems 
and lessons learned from attempts at large-scale re- 
source management, including those of manager- 
scientist interactions (e.g., Marzluff and Sallabanks 
1998; Bormann et al. 1999; Carey et al. 1999; Con- 
cannon et al. 1999), but few described specific steps 
and results that led to successful implementation of a 
management plan. Although some tools and particu- 
larly some of the lessons that were learned from exer- 
cises such as those cited above were pertinent to parts 
of our work, comprehensive guidance or consistent 
methods had not emerged. This is not unusual given 
the evolving state of large-landscape-scale assessments 
under an adaptive management construct. 

We were confronted with many of these problems 
in our roles as managers and regulators of the spotted 
owl on federal lands in northern California. The ques- 
tions we sought to address led to a unique collabora- 
tive approach in adaptive management that would 
support informed decision making. We did not begin 
this effort in a formal structured way. Instead, we 
went through a process of trial and error until we 
eventually realized the importance of following a se- 
quential and integrated adaptive approach to address- 
ing species issues at a large-landscape scale. This was 
not an easy process, particularly at a spatial scale cov- 
ering four national forests (more than 2.2 million 
hectares). Although we experienced lessons similar to 
those reported elsewhere, we believe our eventual 
process was unique because we put the main concepts 
of adaptive management into practice using a gen- 
uinely collaborative process. Resource managers and 
specialists developed hypotheses to test in collabora- 
tion with the scientists. We then developed predictive 
models to apply on a large-landscape basis that ad- 
dressed a suite of ecological factors. This was a much 
more direct method than operating separately, as had 
traditionally been done, and it developed trust, open 
communication, and understanding among team mem- 
bers. Although the process was time consuming, it 
turned out not to be difficult. As a result, managers 
(with scientists’ support) will be using the information 
that was generated for guiding future land-manage- 
ment efforts for the owl. Because we were successful 
in applying a structured adaptive process, we believe 
the description of our approach offers a significant 
learning opportunity that has wider application to fu- 
ture resource planning. 


Background 


The northern spotted owl has captured the attention 
and interest of research biologists, land managers, reg- 
ulators, politicians, lawyers, and the public for over 
twenty-five years. It has been the focus of numerous 
management plans by federal, state, and private 
groups (e.g., USFWS 1990; Thomas et al. 1990; Simp- 
son Timber Company 1992; FEMAT 1993; USDA/ 
USDI 1994a,b; The Pacific Lumber Company 1999). 
The Northwest Forest Plan (hereafter Forest Plan) 
established a system of late-successional reserves 
(LSRs or reserves) covering over 24 million acres on 
eighteen national forests and seven Bureau of Land 
Management districts, including the four national 
forests analyzed in this effort. Over a one-hundred- 
year planning period these reserves should provide 
habitat for multiple late-successional-associated 
species, including the spotted owl (see Fig. 19.1 in 
color section). Given this assumption, there was an 
implicit expectation that further analyses to test and 
adapt management approaches would occur. These 
analyses would provide the information necessary to 
address future changes to species and forest manage- 
ment throughout this period. 

Northern spotted owls are among the most-studied 
and well-known owls in the world (Gutiérrez et al. 
1995) and the best-known owl in northern California 
(e.g., Solis and Gutiérrez 1990; Blakesley et al. 1992; 
Hunter et al. 1995; Zabel et al. 1995; Gutiérrez et al. 
1998; LaHaye and Gutiérrez 1999; Thome et al. 
1999; Franklin et al. 2000). However, major gaps in 
our understanding of spotted owl habitat selection re- 
main, particularly from the large-landscape perspec- 
tive. Survey results were inadequate for large-scale 
analyses because of biases in selection of sites (e.g., 
centered around timber sales), inadequate descriptions 
of survey boundaries, and to a lesser extent variation 
in survey protocol. Lastly, data were not always avail- 
able or in a useable format. Data that were available 
only partially represented the full range of habitat 
conditions found within this ecologically heteroge- 
neous area of northern California. 


Current Situation 


Most planning and regulatory evaluations for owls 
continue to apply the traditional project-by-project or 
site-by-site approach. Interagency efforts to plan proj- 
ects, such as timber harvests, that meet the National 
Forest Management Act (NFMA), ESA, and Forest 
Plan requirements are also hampered by organiza- 
tional and logistical factors. These include differences 
in terminology, inconsistent habitat descriptions, vary- 
ing quality of owl-habitat databases, difficulty of eval- 
uating proposed management activities beyond the 
project or site level to a larger landscape scale, varying 
opinions of individuals involved, and a lack of meth- 
ods to adequately assess cumulative impacts of pro- 
posed management activities in time and space. As a 
result, resource specialists mostly rely on their profes- 
sional judgment to evaluate impacts. In some cases, 
each national forest, U.S. Department of Interior 
(USDI) Fish and Wildlife Service (hereafter Fish and 
Wildlife Service) field station, and U.S. Department of 
Agriculture (USDA) Forest Service (hereafter Forest 
Service) ranger district uses its own unique descrip- 
tion(s) of suitable habitat or evaluation methods. In 
addition, owl surveys are no longer conducted in most 
areas, resulting in a greater dependence on analyses of 
habitat rather than on evaluation of known owl nest 
locations. Consequently, there remains a strong em- 
phasis on evaluating and planning around individual 
owl sites instead of at larger spatial scales. These is- 
sues have resulted in disagreements between the regu- 
latory and land management agencies on owl manage- 
ment, even among personnel from different agencies 
with similar objectives. 


A Collaborative Process in Adaptive Management 


In 1995, the Fish and Wildlife Service and Forest Ser- 
vice in northern California began an informal effort to 
improve the ability of managers to address questions 
about owl management under the Forest Plan. Early 
efforts focused on updating the spotted owl habitat 
database on national forest lands in the California 
portion of the Klamath Province. Although this infor- 
mal approach is common to everyday application of 
resource assessments under the ESA, NFMA, and 
other laws, these early informal efforts to address 
large-landscape issues had little success. Finally, in 
1997, managers formally directed a team of biologists 
from the four national forests and the Fish and 
Wildlife Service's three northern California field offices 
to improve the basis for resource planning and deci- 
sion making under the Forest Plan. This project even- 
tually represented a three-way collaboration among 
resource managers, specialists, and scientists, with 
each team member bringing their own unique expert- 
ise. Although the primary group responsible for this 
effort consisted of wildlife biologists, we were sup- 
ported throughout by a variety of specialists, includ- 
ing resource planners, foresters, forest ecologists, silvi- 
culturists, geographic information system (GIS) 
specialists, fire/disturbance modelers, and ingrowth 
modelers. The term “resource specialist” refers to this 
larger group. The four major tasks the team under- 
took were to 


1. Update and improve the quality of the forest vege- 
tation databases for owl habitat. 

2. Identify and apply more suitable tools to analyze 
and interpret the data at multiple scales. 

3. Determine how to provide the results to decision 
makers in a form that would be useful for owl 
management at larger landscape and longer tempo- 
ral scales. 

4. Create and implement an adaptive approach to 
owl management on Forest Service lands in north- 
ern California. 


Because of the importance of these steps to large- 
area planning, herein we describe the approach, out- 
come, and implications of our efforts and products. 


Collaboration among Resource Specialists 


The initial basis for successful resource planning, 
whether for single or multiple species, is development 
of a credible up-to-date habitat database and map. 
The recent improvement and general availability of 
GISs have greatly increased our ability to develop these 
products for use at larger spatial scales and with 
greater spatial consistency. Although existing forest 
vegetation and spotted owl habitat databases in north- 
ern California were about twenty years old, each of 
the four national forests had recently made efforts to 
update their timber-attributed databases to support re- 
source planning. These databases set the limits of our 
efforts to develop habitat descriptions and a new map 
that reflected spotted owl habitat use in northern Cal- 
ifornia (i.e., we were unable to include some habitat 
features that we felt were relevant to owls when those 
features did not exist in the GIS databases). 


Map Development, Quality, and Accuracy 


We used published information on owl habitat use 
within the province and expanded the description of 
owl habitat based on limited analyses of known owl 
sites and the vegetation types in which they occurred, 
and the professional judgment of resource specialists 
knowledgeable about owls in the Klamath Province. 
The draft descriptions were evaluated and corrected 
using a modified Delphi approach (Coughlan and Ar- 
mour 1992) until specialists were comfortable with 
the quality of the results. To improve our understand- 
ing of future habitat conditions and trends, we also 
used this approach to describe criteria to identify veg- 
etation that would be capable of becoming owl habi- 
tat in the future. The resulting map was consistent 
with our understanding of owl habitat use and was 
more amenable to evaluating ecosystems, as required 
by the Forest Plan. 

Development of an acceptable map was more diffi- 
cult and time-consuming (nearly three years) than 
anyone on the team expected. The quality and accu- 
racy of the forest vegetation databases and our inter- 
pretation of owl habitat relative to those databases 
were significant issues that had to be addressed. Our 
efforts were hampered by the fact that the existing GIS 
vegetation databases among northern California 
forests were not always compatible (not an uncom- 
mon situation among resource agencies and adminis- 
trative units). For example, each database originated 
from different mapping efforts, and coding or labeling 
was not consistent for the same attributes. In addition, 
resource specialists and managers had rarely ques- 
tioned the quality of the information contained on old 
maps, which made it difficult to ensure map accuracy. 
This resulted in numerous false starts as errors were 
found and the maps had to be recreated. Eventually a 
single seamless map of suitable and capable owl habi- 
tat across the four forests was completed. Based on 
our best professional judgment, we assumed this map 
offered a better basis for analyzing management ac- 
tions on owls under the Forest Plan. 


Collaboration among Scientists 


In response to questions raised about the use and 
quality of the updated vegetation database and habitat 
descriptions, scientists from the USDA Pacific South- 
west Research Station undertook an effort to quanti- 
tatively evaluate the effectiveness of these habitat de- 
scriptions at predicting owl presence-absence. They 
also recommended that formally applying a proba- 
bilistic approach to modeling the landscape for owl 
occurrence would significantly enhance the quality of 
the map. This step represented a significant departure 
from management and regulatory agencies’ traditional 
approach to using available data and maps. Involve- 
ment of scientists required integrating their goals with 
those of management. Consequently, the following 
specific goals were agreed upon: 


1. Develop habitat models for predicting owl pres- 
ence-absence using both the old and new habitat 
descriptions. 

2. Determine the optimal spatial scale to apply the 
models. 

3. Compare and rank the various models using objec- 
tive criteria. 

4. Test the highest-ranked models on independent 
data sets. 

5. Evaluate various methods to apply the best 
model(s) for management needs. 


Because of the significance of these products (from 
an ecological and economic perspective), this sci- 
ence-based modeling approach became the major 
focus of our effort and laid the foundation of our 
adaptive process (for a thorough treatment of the 
modeling effort, see Zabel et al. in review). 


Model Development 


Developing habitat-based models to predict the pres- 
ence-absence of wildlife species is relatively straight- 
forward. First, several attributes hypothesized to be 
important to the species in areas that are occupied and 
unoccupied (though apparently available to the 
species) are measured. Then, sites with and without 
the species are compared to determine which attrib- 
ute(s) are most closely associated with presence- 
absence. Alternative models developed in this manner 
are evaluated and compared, and the best model is se- 
lected (e.g., Johnson et al., Chapter 12; Young and 
Hutto, Chapter 8). 

To develop habitat models for this project, we used 
data from sites that had been randomly selected and 
surveyed for spotted owls on national forests in north- 
ern California. These sites had been surveyed accord- 
ing to a standardized protocol for two consecutive 
years (1988 and 1989) so that both occupied and un- 
occupied sites were determined. To facilitate our un- 
derstanding of owl-habitat associations, we developed 
models that discriminated between sites with and 
without owls at three spatial scales using concentric 
circles that approximated different aspects of an owl’s 
home range size: 200 hectares, 550 hectares, and 900 
hectares. Models were developed by placing concen- 
tric circles over the vegetation polygons using 
ARC/INFO software (ESRI 1998) and then calculat- 
ing the quantity of each covariate within those circles. 
Three habitat covariates were evaluated: (1) the total 
area of nesting, roosting, and foraging habitat; (2) the 
total length of linear edge between habitat and non- 
habitat; and (3) the amount of core area within each 
polygon (defined by buffering each polygon by 100 
meters and determining the interior area). Linear, 
quadratic, and threshold forms of the relationship be- 
tween the probability of owl occupancy and the three 
habitat covariates were then evaluated using logistic 
regression (sensu Franklin 1997) (Fig. 19.2). Six habi- 
tat descriptions were also compared that allowed us to 
take into account different quantities and forms of re- 
lationships between the covariates and probability of 
owl occupancy. 

Figure 19.2. Example of linear, quadratic, and threshold forms 
of the relationship between habitat quantity and probability of 
owl occupancy. 
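
The analysis circles and the three functional forms named in the preceding paragraph can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, the coefficients are arbitrary, and the threshold form is approximated here by a natural-log ("pseudo-threshold") term, which is only one of several ways such a form could be specified.

```python
import math


def circle_radius_m(area_ha):
    """Radius in meters of a circle with the given area (1 ha = 10,000 m^2)."""
    return math.sqrt(area_ha * 10_000.0 / math.pi)


def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def p_linear(x, b0, b1):
    """Occupancy probability with a linear habitat term."""
    return logistic(b0 + b1 * x)


def p_quadratic(x, b0, b1, b2):
    """Occupancy probability with linear and squared habitat terms."""
    return logistic(b0 + b1 * x + b2 * x * x)


def p_threshold(x, b0, b1):
    """Occupancy probability that levels off as habitat increases.

    Assumption: a log(x + 1) term stands in for the threshold form.
    """
    return logistic(b0 + b1 * math.log(x + 1.0))


# Radii implied by the three circle sizes used in the text.
for area in (200, 550, 900):
    print(area, "ha ->", round(circle_radius_m(area)), "m radius")

# Arbitrary coefficients, chosen only to show the three shapes.
for habitat_ha in (0.0, 50.0, 100.0, 200.0):
    print(habitat_ha,
          round(p_linear(habitat_ha, -2.0, 0.02), 3),
          round(p_quadratic(habitat_ha, -2.0, 0.05, -0.0002), 3),
          round(p_threshold(habitat_ha, -2.0, 0.8), 3))
```

In practice the coefficients would be estimated by logistic regression from the surveyed occupied and unoccupied sites rather than set by hand.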


Ranking and Selecting Models 


We developed approximately one hundred models at 
each of the three spatial scales (200, 550, and 900 
hectares). The bias-corrected Akaike’s Information 
Criterion (AIC; see Burnham and Anderson 1998) was 
used to determine the most parsimonious model(s) 
that discriminated between occupied and unoccupied 
owl sites. The two models with the lowest AIC within 
each of six habitat descriptions, and at each of three 
spatial scales, were selected for further comparison 
and testing on independent data (i.e., a total of thirty- 
six models). 
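
For readers unfamiliar with the criterion, the following Python sketch shows how AIC and its small-sample (bias-corrected) form, usually written AICc, are computed from a model's maximized log-likelihood. The model names, log-likelihoods, parameter counts, and sample size are invented for illustration and are not results from this study.

```python
def aic(log_likelihood, k):
    """Akaike's Information Criterion for a model with k estimated parameters."""
    return -2.0 * log_likelihood + 2.0 * k


def aicc(log_likelihood, k, n):
    """Small-sample (bias-corrected) AIC for n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_likelihood, k) + (2.0 * k * (k + 1)) / (n - k - 1)


def delta_aic(scores):
    """Difference between each model's score and the lowest score."""
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}


# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {
    "linear": (-101.3, 2),
    "quadratic": (-98.9, 3),
    "threshold": (-99.4, 2),
}
n_sites = 104  # hypothetical number of surveyed sites

scores = {name: aicc(ll, k, n_sites) for name, (ll, k) in candidates.items()}
for name, delta in sorted(delta_aic(scores).items(), key=lambda item: item[1]):
    print(name, round(scores[name], 2), round(delta, 2))
```

The model with the lowest score is the most parsimonious of the set; the differences from that lowest score, rather than the raw values, carry the information used for ranking.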

After critically evaluating the merits of both AIC 
and percentage correct classification, we decided that 
AIC and the percentage of owl-occupied sites correctly 
classified would be used to select the best models. 
Under percentage correct classification, predicted 
probabilities of occupancy are considered correct (as- 
signed a value of 1) if they exceed some predetermined 
cutoff point, and incorrect (assigned a value of 0) if 
they fall below that cutoff point. Although most statis- 
tical software packages use 0.5 as the arbitrary cutoff 
point, there are many instances when 0.5 may be in- 
adequate. Choice of a probability cutoff point is anal- 
ogous to decisions regarding Type I and II errors. As 
Nichols et al. (1995) noted, in science there is a strong 
bias against Type I errors in which a null hypothesis is 
mistakenly rejected. Therefore, scientists typically as- 
sign a low probability (e.g., 0.05) for Type I errors de- 
spite the fact that lower probabilities of Type I errors 
produce higher probabilities for Type II errors (failure 
to reject false hypotheses) and, hence, to detect real 
differences. As a result, this places the burden of proof 
with resource managers. Because of ESA requirements 
to protect known individuals of a species, the Fish and 
Wildlife Service was more concerned with errors of 
omission (i.e., predicting absence when owls were 
present) than errors of commission (i.e., predicting 
presence when owls are absent). Therefore, we de- 
cided it was more important to correctly predict owl 
presence than it was to correctly predict their absence. 
We separately determined an optimal cutoff point for 
each model based on the following criteria: (1) per- 
centage correct classification of owl-occupied sites was 
greater than 75 percent, and (2) any loss in percentage 
correct classification of owl-occupied sites was more 
than compensated for by a gain in percentage correct 
classification of unoccupied sites. Final model rank- 
ings were based on the average of the AIC rank plus 
ranks of percentage correct classification of owl- 
occupied sites. As recommended by Nichols et al. 
(1995) and Burnham and Anderson (1998), an empir- 
ical Bayesian approach was also used to rank the 
models and compare results. 
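
One way to read the two cutoff criteria above is sketched below in Python. The stepwise search, the function names, and the example probabilities are assumptions introduced for illustration; the text does not specify the exact procedure that was followed.

```python
def percent_correct(probs, observed, cutoff):
    """Return (% occupied sites, % unoccupied sites) classified correctly.

    A site is predicted occupied when its probability is >= cutoff.
    """
    occupied = [p for p, y in zip(probs, observed) if y == 1]
    unoccupied = [p for p, y in zip(probs, observed) if y == 0]
    occupied_ok = sum(p >= cutoff for p in occupied)
    unoccupied_ok = sum(p < cutoff for p in unoccupied)
    return (100.0 * occupied_ok / len(occupied),
            100.0 * unoccupied_ok / len(unoccupied))


def choose_cutoff(probs, observed, candidates, min_occupied=75.0):
    """Raise the cutoff step by step (candidates sorted ascending).

    A higher cutoff is accepted only if (1) more than min_occupied percent
    of occupied sites are still classified correctly and (2) the gain for
    unoccupied sites exceeds the loss for occupied sites.
    """
    best = candidates[0]
    best_occ, best_unocc = percent_correct(probs, observed, best)
    for cutoff in candidates[1:]:
        occ, unocc = percent_correct(probs, observed, cutoff)
        if occ <= min_occupied:
            break  # criterion 1 fails here and at every higher cutoff
        if (unocc - best_unocc) > (best_occ - occ):
            best, best_occ, best_unocc = cutoff, occ, unocc
    return best, best_occ, best_unocc


# Hypothetical predicted probabilities and observed occupancy (1 = owls present).
probs = [0.9, 0.8, 0.7, 0.6, 0.3, 0.5, 0.4, 0.2, 0.1, 0.35]
observed = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(choose_cutoff(probs, observed, [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]))
```

Starting from a low cutoff keeps errors of omission to a minimum, which matches the priority given to correctly predicting owl presence.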


Model Testing 


Models should not be used as the basis for manage- 
ment decisions without testing (Conroy 1993). Testing 
should be conducted on truly independent data (Field- 
ing, Chapter 21). Therefore, we selected the best two 
models within each habitat description at each spatial 
scale and then tested them using eight independent 
data sets. Each independent study area had been com- 
pletely censused for owls. Thus, both presence and ab- 
sence were documented, most over periods of longer 
than two years. The study areas were well distributed 
throughout the Klamath Province and provided a rep- 
resentative test of our best models for this region. 
Again, we used both AIC and the percentage of owl 
sites correctly classified to evaluate the performance of 
each model on the independent data sets. This allowed 
us to compare the accuracy of the twelve best models 
at each spatial scale. Model ranks for the best models 
were fairly consistent for both the development data 
set and the test data sets. In addition, the percentage 
correct classification of owl-occupied sites was greater 
than 90 percent for our best models. The approach we 
developed ([AIC rank + percentage correct classifica- 
tion rank] / 2) and the Bayesian approach gave very 
similar results for the top models. This further 
strengthened our confidence in our choice of the best 
models; they fit all of the independent study areas with 
a high degree of accuracy. This exercise produced a 
best habitat model and two potentially competing 
models. We used the best model to evaluate the qual- 
ity of owl habitat across the landscape. 
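
The combined ranking given in the paragraph above, ([AIC rank + percentage correct classification rank] / 2), can be written out as a short Python sketch. The model scores are invented, the function names are hypothetical, and ties are not given any special treatment here, which is an assumption.

```python
def rank(values, higher_is_better=False):
    """Rank values so that 1 is best; ties keep their input order."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def combined_rank(aic_scores, pct_occupied_correct):
    """Average of the AIC rank and the rank of percentage correct
    classification of occupied sites (lower combined value is better)."""
    aic_ranks = rank(aic_scores)  # lower AIC is better
    pcc_ranks = rank(pct_occupied_correct, higher_is_better=True)
    return [(a + p) / 2 for a, p in zip(aic_ranks, pcc_ranks)]


# Hypothetical scores for three candidate models.
aic_scores = [210.4, 205.1, 208.7]
pct_correct = [88.0, 92.0, 95.0]
print(combined_rank(aic_scores, pct_correct))
```

Averaging ranks rather than raw scores puts the two criteria on a common scale, so neither the magnitude of AIC differences nor that of classification percentages dominates the result.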

Owing to the collaboration of researchers and man- 
agers, this phase of our process differed from what 
would have been done had this been a pure research 
project. First, we would not have selected the top two 
models within each habitat description for subsequent 
testing. Instead, we would have chosen a subset of the 
top-ranking models. Our decision to keep the best two 
models within each description was management 
driven. For example, the top-ranking model (habitat 
description) currently used by the management and 
regulatory agencies ranked fifty-fifth using the devel- 
opmental data set, nowhere near the top twelve mod- 
els. However, since it was the habitat description being 
used, it seemed important to give it a “fair chance” in 
both the model development and testing phases be- 
cause our results could ultimately lead to a change in 
that habitat description. 


A Framework for Future Collaboration 


Although it may seem obvious to some, it is critically 
important for managers and resource specialists to un- 
derstand (at least conceptually) the analytical tech- 
niques that will be used by those who develop wildlife 
habitat models. For example, once the resource special- 
ists and managers on our team understood what AIC 
was, we collectively made purposeful choices regarding 
the weight we gave it relative to errors of commission. 
If this had been a pure research project, different crite- 
ria may have been chosen and the results may have 
been less understandable or useful to resource man- 
agers. Equally important, the scientists had to under- 
stand the needs of managers and specialists. This was 
an example of the collaborative interaction between sci- 
entists and managers that embodies the principles of 
conservation biology and adaptive management: the 
application of the best information to management, 
even in the absence of complete information, where the 
results of the application will provide new information. 

To accommodate the application of habitat models 
in resource management, we suggest the following hi- 
erarchical (adaptive) approach to model development, 
testing, and application: 


1. Resource scientists, managers, and specialists 
should work together closely from the beginning 
phases of any planning effort to ensure the useful- 
ness of resulting models. 

2. Models should be developed using data from large- 
enough areas to warrant their application to a vari- 
ety of conditions. 

3. Models should be tested on independent data to 
evaluate their accuracy. 

4. Models should be tested in a manner consistent 
with how they are intended to be used. 


Although incredibly useful for elucidating features of 
the biology of organisms of interest, many wildlife- 
habitat models have fallen short of being applied practi- 
cally. To be fair, many times model application has not 
been the goal of the scientists, although we suspect that 
most expect their work will be of practical use. Regard- 
ing the concerns of scientists versus managers about 
models, Salwasser (1986) noted that “determining ac- 
curacy is the purview of the scientist; practicality, that 
of the manager." Therefore, ultimate decisions on the 
performance of habitat models must be made with or 
by the managers who will be using them. 


Collaboration for Successful Adaptive Management 


The final collaborative step in this exercise was for the 
managers, resource specialists, and scientists to estab- 
lish a basis for interpretation of owl habitat quality so 
that management recommendations pertinent to the 
scale of the Klamath Province could be developed. 
Our thoughts on approaches and methods for analysis 
evolved as we refined our questions about owl habitat 
relationships through the process of map development 
and model testing. We eventually realized that many 
of our traditional ideas about data analysis and meth- 
ods at the site scale were not appropriate at larger 
scales. Consequently, we strove to complete our ef- 
forts with a more rigorous and collaborative approach 
to developing management recommendations within 
an adaptive management framework. 

TABLE 19.1. 

Questions used to identify and prioritize Late Successional 
Reserves for managing northern spotted owls (Strix 
occidentalis caurina) in the Klamath Province in northwest 
California. 

Question 1  What is the quality of owl habitat within and 
            between reserves and groups of reserves? 

Question 2  Does an opportunity exist (and where) to im- 
            prove owl habitat through silvicultural treat- 
            ments? 

Question 3  Is there a need (and where) to manage for 
            fuel hazard and risk? 

The first step was to jointly refine the questions of 
management interest in northern California. Table 
19.1 identifies the three primary questions that we 
agreed were the most significant to both regulatory 
evaluation and owl management under the Forest Plan 
in the Klamath Province. These questions helped focus 
our efforts to select appropriate landscape features, 
evaluate the available data, and use the results to rate 
habitat quality for spotted owls at different scales. 


Application of the Model to the Map 


The primary task in the interpretive process was to as- 
sess the current habitat quality of individual reserves 
and their potential quality. Thus, we evaluated how 
best (or whether it was reasonable) to apply the habi- 
tat model at the scale of a reserve or a group of inter- 
acting reserves. The models generated spatially ex- 
plicit predictions within a large landscape, but the 
absolute results (i.e., the quality of habitat within each 
reserve) were the values of interest. Using the best 
model, we applied a hexagonal grid that covered the

Figure 19.3. Example of hexagon grid applied to a group of
Late Successional Reserves in the Klamath Province in northwestern
California. Each hexagon has the probability of northern
spotted owl (Strix occidentalis) occupancy attached to it.


landscape using the scale of the model that performed 
best: 200 hectares. The spatially explicit predictions 
were only applicable to this scale. We chose to use 
hexagons rather than circles to apply our model. When 
linked in a grid network, hexagons fully cover the 
landscape, unlike circles, and their shape closely resem- 
bles a circle (Fig. 19.3; Noon and McKelvey 1996b). 
The product of this exercise was a map in which each 
hexagon contained a probability of owl occupancy 
that we assumed to represent habitat quality. This ap- 
proach allowed us to evaluate habitat quality at differ- 
ent spatial scales, from small to large reserves, to 
groups of reserves and intervening forested land. 
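
The geometry behind the 200-hectare hexagon grid can be sketched as follows. This is a minimal illustration, not the procedure used in the analysis: the function names, the pointy-top orientation, and the rectangular extent are all assumptions made for the example. It computes the side length of a regular hexagon of a given area and lays out hexagon centers so that the cells interlock without gaps.

```python
import math


def hexagon_side_m(area_ha: float) -> float:
    """Side length (m) of a regular hexagon with the given area in hectares."""
    area_m2 = area_ha * 10_000.0
    # Area of a regular hexagon: A = (3 * sqrt(3) / 2) * s**2
    return math.sqrt(2.0 * area_m2 / (3.0 * math.sqrt(3.0)))


def hexagon_centers(width_m: float, height_m: float, side_m: float):
    """Centers of pointy-top hexagons laid over a width x height rectangle.

    The orientation and the rectangular extent are illustrative assumptions.
    """
    dx = math.sqrt(3.0) * side_m  # spacing between centers within a row
    dy = 1.5 * side_m             # spacing between rows
    centers = []
    row = 0
    y = 0.0
    while y <= height_m:
        # Odd rows are shifted half a cell so neighboring hexagons interlock.
        x = dx / 2.0 if row % 2 else 0.0
        while x <= width_m:
            centers.append((x, y))
            x += dx
        y += dy
        row += 1
    return centers


side = hexagon_side_m(200.0)
# Radius of a circle with the same 200-hectare area, for comparison.
equal_area_circle_radius = math.sqrt(200.0 * 10_000.0 / math.pi)
```

In practice each center would carry the occupancy probability predicted by the habitat model for that cell; that step depends on the model and data and is not shown here.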

Knowledge of landscape patterns and the factors
that affect them is important to fully understand resource
issues (Concannon et al. 1999). For example,
Perry (1995) discussed the importance of considering
the role and scale of disturbance in managing landscapes.
To ensure that an adequate suite of factors was included,
we identified a set of other qualitative and
quantitative factors that were important for evaluating
the reserves, to complement the modeling results.
These factors included the probability and estimated 
intensity of wildfire, estimates of reserve connectivity 
based on published spotted owl dispersal distances, 
and projections of areas that were capable of becom- 
ing suitable or higher quality in the future. Because it 
is also important to know the scale at which these fac- 
tors interact, we determined the spatial and temporal 
nature of each and whether they lent themselves to 
qualitative or quantitative analysis. For example, fire 
data were provided as probabilities of future occur- 
rence across relatively large areas, while distance be- 
tween reserves was used to assess connectivity. Data 
representing these factors were compiled and tabu- 
lated from other planning documents or databases de- 
veloped by the national forests. Although there were 
concerns about the accuracy and currency of some of 
these data, there were neither useful methods nor 
other data to address them. These factors were mod- 
eled, reported as percentages, or qualitatively summa- 
rized in tabular form to make further comparisons of 
the reserves. 


Application of the Results by Management 


To provide the basis for interpretation of the compiled 
data, a spreadsheet was created that was linked to the 
updated GIS database. Within this spreadsheet, we di- 
vided the more than 2.2 million hectares of national 
forest lands in northern California into different land- 
scape categories associated with the Forest Plan (re- 
serves, non-reserved or matrix lands, and other ad- 
ministratively reserved areas such as wilderness). This 
spreadsheet allowed us to easily evaluate and compare 
results among reserves for a suite of factors pertinent 
to federal owl management at a large-landscape scale. 

The probability results from the hexagon model 
and the summary data from each of the selected fac- 
tors were evaluated and numerical ratings or condi- 
tion indices were generated for each. The resulting 
table of indices provided the basis for a qualitative cu- 
mulative assessment of habitat quality or condition 
for both current and expected future conditions 
within each reserve. The indices and base data for

TABLE 19.2.

Summary results of Late Successional Reserve (LSR) analyses for each
management question about northern spotted owls in northern California.

                                             Management questions
                                                  Number of LSRs       Number of LSRs
                   Number of   Number of LSRs     with high priority   with high priority
Ecological         LSRs        with low           for silvicultural    for fuels reduction
zone               analyzed    habitat quality    treatment            treatment

Western Klamath    18          (0)                2                    15
Eastern Klamath    5           3                  1                    2
Western Cascades   2           2                  2                    0
Modoc              1l          dL                 T                    0
Interior Coast     7           S                  4                    0


each factor were ranked and displayed as a frequency 
distribution. This frequency distribution was used to 
determine whether obvious thresholds or cutoff points 
existed that would relate to levels of management 
interests or needs. Finally, a new table was created 
that rearranged the reserves into threshold categories 
for each factor. This new list was used to rank the re- 
serves for each of the three management questions 
noted earlier (see Table 19.1). Three separate lists 
were generated that prioritized reserves (Table 19.2) 
according to (1) quality of habitat (question 1), (2) 
need for silvicultural treatment to improve quality of 
owl habitat (question 2), and (3) need for fuel or wild- 
fire reduction treatment (question 3). An array of rec- 
ommendations for management and regulatory use 
was then collaboratively determined for each reserve 
based on the cumulative assessment and priority of 
management need. Recommendations included pre- 
scribed burning, thinning, timber harvest, and other 
activities that would contribute to management of 
spotted owl habitat at different landscape scales and 
were specific to the needs of individual reserves. The 
range of recommendations we developed offers flexi- 
bility for resource management at different spatial and 
temporal scales that endeavors to meet owl conserva- 
tion needs. As further conditions or data change, the 
assessment can easily be revised and recommendations 
adjusted accordingly. 
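
The ranking step described above can be illustrated with a small sketch. It is not the spreadsheet procedure used in the analysis: the reserve names and index values are hypothetical, and choosing the widest gap in the ranked values as the cutoff is only one assumed way of formalizing an "obvious threshold" in a frequency distribution.

```python
def largest_gap_threshold(indices):
    """Midpoint of the widest gap between adjacent ranked index values.

    Using the widest gap as the cutoff is an illustrative assumption.
    """
    ordered = sorted(indices.values())
    gaps = [(b - a, (a + b) / 2.0) for a, b in zip(ordered, ordered[1:])]
    _width, midpoint = max(gaps)
    return midpoint


def categorize(indices, threshold):
    """Rank reserves by index and assign each to a threshold category."""
    ranked = sorted(indices.items(), key=lambda item: item[1])
    return [
        (name, value, "low" if value < threshold else "high")
        for name, value in ranked
    ]


# Hypothetical condition indices for five hypothetical reserves.
habitat_index = {
    "LSR-A": 0.21,
    "LSR-B": 0.25,
    "LSR-C": 0.58,
    "LSR-D": 0.62,
    "LSR-E": 0.30,
}
cutoff = largest_gap_threshold(habitat_index)
ranking = categorize(habitat_index, cutoff)
```

Repeating the same steps for each factor would give one ranked list per management question, which could then be reviewed jointly by managers and scientists.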


Future Application of Model Results 


Adaptive management, in which science is a substantial
part of planning, evaluating, and modifying management
strategies, can improve interactions between
scientists and managers, thereby increasing the effectiveness
of planning, allocation, and management of
resources (NAS 1997). To ensure the results of our exercise
have lasting utility to both the regulatory and
management agencies, we developed four products to 
support future planning and decision making: (1) a 
comprehensive database and seamless map (and asso- 
ciated metadata), (2) a table that ranks and lists rec- 
ommendations for each reserve and larger area, (3) 
guidelines for using the information and tools, and (4) 
a procedure for incorporating new information and 
adjusting recommendations. We expect these products 
to be used by managers to draw reasonable and sup- 
portable conclusions about owls and owl habitat at 
scales much larger than an individual owl site, allow- 
ing for more-efficient land management planning and 
fulfillment of regulatory requirements under the ESA 
and NFMA. For example, the model can be used as a 
planning tool to help regulators evaluate potential ef- 
fects of management activities and to identify areas 
where projects (management activities) such as timber 
harvests are most likely to improve owl habitat quality 
or to minimize the reduction of habitat quality. How- 
ever, we realize this cannot be accomplished without 
educating staff and managers to use the products and 
process we have developed. 

We envision an active approach to continuing ap- 
plication of our efforts, as described by Nichols et al. 
(1995). Adaptive management treats management as 
an experiment and evaluates whether the desired and 
hypothesized outcome emerges after some period of
time (Gunderson 1999). In our case, the experiment
was evaluating a large number of competing habitat 
models. Our treatment is implementing the new 
map/model in resource management where implemen- 
tation has a hypothesized outcome, in particular, sta- 
bility of owl populations in the long term and a reduc- 
tion in the rate of population decline in the short term 
(5-10 years). Although the specific steps to do this 
have not yet been implemented, we have identified a 
number of key activities that will allow us to continue 
to test, improve, and revise both the products and 
their implementation over time. 

To ensure that our initial attempt at applying an 
adaptive approach succeeds, several critical tasks re- 
main. There is a need to continue to collect and test 
data because of the complexity of owl-habitat rela- 
tions, the stochastic nature of forest dynamics, and the 
recognition that our efforts reflect only what we cur- 
rently know. This will provide the data to allow our 
predictions to be tested (Walters 1997). A major step is 
to integrate this exercise into the owl-monitoring pro- 
gram so that results from studies of spotted owl demo- 
graphics can be used to link demographic performance 
with habitat quality and management actions. This will 
allow us to evaluate associations among management 
activities (actions), probability of owl occupancy, and 
demographic parameters. Related to this is the need to 
test the results of our "Delphi" approach to refine the
different habitat descriptions used in the five ecological 
zones and in particular to investigate the effects of vari- 
ation in vegetation structure within these different 
habitat types. We made other assumptions that can 
and should be tested concurrently with this process. 
These include evaluating whether the owl is an indica- 
tor or umbrella species for other late-successional 
species, investigating the utility of this process for addressing
other species/habitat conflicts, and understanding
how forest management and manipulation of
habitat quantity and distribution affect spotted owl
and barred owl (Strix varia) interactions. Future suc- 
cess, however, is predicated upon a continuing effort to 
improve and maintain GIS vegetation databases, using 
new remotely sensed and ground-plot data, and ensur- 
ing that the databases accurately reflect continuing 
changes in the forests due to fires and other disturbances.
A formalized cyclic approach needs to be undertaken
in which use and revision of the above products
are linked with research and monitoring programs,
both at the regional and national-forest scale.


Figure 19.4. Conceptual model of the collaboration process
among managers, resource specialists, and scientists.


We hope that the application of our best model to 
the landscape and our focused concern on reserves 
that are not currently providing well for owls will re- 
sult in a reduction in the rate of population decrease 
in northern California in the short term (ten years). 
However, because of the new approach described 
here, we must view these products as a first generation 
that will be improved and revised as we learn more 
about analyses, management, and owl habitat use and 
population dynamics at this scale. As a result, we will 
need to continue to work to integrate this process into 
future planning efforts so that our respective field 
units can repeat it indefinitely. This will enable us to 
routinely test and revise our results and associated 
management recommendations as new information 
becomes available—in other words, to practice adap- 
tive management (Fig. 19.4). 


Lessons Learned 


Although we did not begin this effort under an adaptive
management framework, the learning process itself
became an adaptive process by default. Over the
four years of this effort, we learned much about the
steps involved and the problems encountered in using
large-scale resource information for management in a
structured and adaptive setting. In particular, maintaining
contacts among disciplines and between scientists
and managers has become more critical than in
the past. However, there is an increasing workload as- 
sociated with resource management, primarily due to 
the complexity and general lack of understanding 
about ecosystem management at any scale. Therefore, 
although the following is by no means an exhaustive 
list, we offer a synopsis of the key lessons learned that 
may be helpful to others attempting to undertake sim- 
ilar efforts. 


Collaboration 


Close collaboration between managers and scientists 
has not been the traditional approach to resource 
management. Large-scale resource planning, especially 
from a landscape perspective, requires an interdiscipli- 
nary approach, specialized knowledge, open commu- 
nication, team compatibility, and integrated thinking. 
Although it has been recommended that scientists and 
managers should have a "translator" to foster com- 
munication (Schonewald-Cox 1994), we realized it 
was more critical for our team to have each group do 
the translating. Through this interaction, we eventu- 
ally realized that scientists, managers, and resource 
specialists need to work closely from the beginning 
phases of any planning effort and should maintain 
their collaboration into implementation. By applying 
an adaptive and collaborative approach from the 
start, research could more easily be directed to sup- 
port management needs, and management could more 
efficiently take research findings into consideration, 
thus reducing uncertainty and improving resource 
management. Although we did not include representa- 
tives from special interest groups on our team, we 
gave several public presentations to such groups. 
Based on their comments, all seemed supportive of our 
process and conclusions. 


Changing Paradigms 


Dealing with change, particularly change brought
about by applying new concepts, was critical in our
endeavor to adaptively manage. However, there continues
to be resistance among people and institutions
to change. For example, even six years after the Forest 
Plan mandated the change from project-scale to 
large-area planning, this shift had not occurred. Seek- 
ing, analyzing, and applying new information and 
methods involves taking levels of risk that make some 
people and institutions uncomfortable. We believe 
that people will be more open to change if they are in- 
cluded from the beginning phases of a project rather 
than having new systems imposed on them from 
higher levels of government. Adaptive management is 
a structured and formal approach that requires fo- 
cused and collaborative efforts to successfully inte- 
grate it into everyday operations. That had not been 
our experience as agency resource specialists or scien- 
tists, where it was often treated as an additional task 
or sometimes a constraint, if it was applied at all. 
Adaptive management offers a potential solution to 
dilemmas encountered when managing natural re- 
sources, such as uncertainty, conflicting information, 
and how to evaluate whether management is success- 
ful, and if not, why not (Lancia et al. 1996). We need 
to take a proactive approach to acquire the informa- 
tion necessary to avoid reacting to a problem after it 
has occurred. This is particularly important given the 
assumptions that underlie management policies, espe- 
cially over these large areas. 


Temporal and Spatial Scales 


Temporal and spatial scale (extent and grain) issues 
are poorly understood in resource management, par- 
ticularly when evaluating larger landscape units. We 
often found that data we had used to make manage- 
ment decisions prior to this project were applicable 
only at the site or local level and often had little rele- 
vance to questions that were pertinent to resource 
management at larger scales such as reserves or ecological
zones. Although this is not usually considered, we
should recognize that landscape goals dictate the level
of analysis. By analyzing the context of an action
within the larger landscape, we felt that we were bet- 
ter able to understand not just the effects of an action, 
but also the significance of those effects important to 
the scale of the Forest Plan. Managers and scientists 
must continue to ask whether a species needs to be
managed at a coarse or fine scale, and to identify questions
and apply techniques that are appropriate to that
scale.


Data Quality and Availability 


There is an increasing need for more resource data 
that address critical management questions, and par- 
ticularly for data and maps of known quality and ac- 
curacy to carry out large-area assessments such as this 
one. The quantity of new and existing information 
about spotted owls is immense (Fig. 19.5). Even with 
all of the previous work completed on the spotted 
owl, we were continually surprised at how poorly 
data were maintained, how inconsistently they were 
reported, or how few data were accessible or even use- 
ful. This problem alone caused the most frequent and 
longest delays in our effort. This, coupled with the 
rapid rate of emerging ideas on habitat analyses, re- 
source selection, mathematical and statistical models, 
and issues of scale, makes it extremely difficult for 
agency managers and specialists to remain current. As 
agencies attempt to improve their efforts toward 
ecosystem management, major emphasis needs to be 
placed on maintaining and spatially linking data, 
keeping data accessible, and using long-term data sets, 
all in collaboration with scientists. 


Methods and Tools 


Figure 19.5. Cumulative number of publications on northern
spotted owls (Strix occidentalis caurina) from 1985 through
1998.


There is a general lack of applicable and easily used
methods or models for resource specialists and managers
to apply when addressing large-landscape-level
questions. The lack of supported methods and inconsistent
terminology continually hampered our project.
In addition, the way in which we used tools such as
GIS adds a level of complexity, cost, and time that managers
and resource specialists are reluctant to fund. Because
of skepticism regarding conceptual or theoretical
approaches, testing or piloting new methods in real sit- 
uations with actual data is critical and should be a nor- 
mal part of the process (Ringold et al. 1999). This is 
particularly important in testing assumptions and ad- 
dressing the relationships in species/habitat interac- 
tions. We agree with the suggestion of considering mul- 
tiple models (rather than a single most-probable 
model) in developing management strategies and then 
assessing their relative credibility by comparing com- 
peting predictions with subsequent observations (Con- 
roy 1993). We also agree with the perspective of Young 
and Varland (1998) regarding “meaningful” research 
in a management environment—in other words, re- 
search that can be used to help make management 
decisions. 


Conclusions 


Our process is amenable to new information (e.g., dispersal
data or habitat relationships in additional areas)
that may emerge in the future. It is specifically set
up to be an adaptive (repeatable) process that will fur- 
ther the progress we have made through this collabo- 
rative effort. Using an adaptive approach should not 
only change the way we work but also should make 
our work more efficient and proactive. It is our hope 
that this analytic process will serve as an effective 
model during future efforts to develop a comprehen- 
sive owl conservation plan for all public and private 
lands in northern California. 

Close collaboration increased our appreciation and 
understanding of each other's perspectives and priori- 
ties. This effort was not easy and was at times frus- 
trating (see Hejl and Granillo 1998 for additional in- 
sights). Had we not learned to work together, 
however, we would not have gained the ability to shift 
our way of viewing, and thus managing, the landscape 
from a deterministic to a probabilistic manner. Project 
impacts would have continued to be evaluated individually
without looking at cumulative effects in time
and space. We would have had no quantitative way to 
guide project planning into the future as our knowl- 
edge improved and changed. We strongly encourage 
this model of collaboration among scientists, resource 
specialists, and managers because we found it was 
fundamental to successful adaptive management.


Modeling Wildlife Distribution within
Urbanized Environments: An Example of the 
Eurasian Badger Meles meles L. in Britain 


Amanda Wright and Alan H. Fielding 


The impact of urbanization has rarely been addressed
within wildlife-habitat models, and yet
the rate of urban development is an ongoing issue in
wildlife conservation. Most wildlife-habitat models
are developed within relatively natural, undisturbed
environments. However, their applicability to populations
within urbanized locations has rarely been addressed
(Boal and Mannan 1998). One such species
that is influenced within part of its range by urban de- 
velopment is the Eurasian badger (Meles meles). The 
badger is a protected species within the United King- 
dom through the Protection of Badgers Act, 1992, 
which also covers any sett (the burrow system occu- 
pied by badgers) with indications of badger use. Bad- 
gers live in social groups, usually occupying one main 
sett within a defended range. A main sett is distin- 
guished from other setts within a territory by its con- 
tinuous use by the social group (Kruuk 1978). Sett 
losses rather than mortality of individual badgers are 
the most significant population threat (Harris et al. 
1995). Consequently, species protection requires 
knowledge of the distribution of main setts. In urban 
areas, particularly at the urban fringe, the landscape is 
changed by rapid land development (Trietz et al. 
1992; Bolger et al. 1997). The impact of developments
such as housing and road-building schemes, and the
need for mitigation, in some cases by translocating
badger social groups, require knowledge of the
distribution and habitat requirements of badgers
within urbanized environments. Previous studies on
rural badgers (e.g., Wiertz and Vink 1986; Thornton 
1988; Cresswell et al. 1990) have shown that setts 
tend to be more common in well-wooded, hilly areas 
at an altitude between 100 and 200 meters with a soil 
that is relatively easy to dig. However, it is not clear
that these findings can be applied to badgers living in an
urbanized landscape.

We investigated whether habitat factors considered 
important for determining badger distribution else- 
where within its range were applicable to main-sett 
distribution across a heavily urbanized area, with 
main setts used as a surrogate for social-group distri- 
bution. Our main objective was to determine whether 
areas used for main sett construction could be discrim- 
inated from unused areas within the urbanized land- 
scape, but we also wanted to determine whether a 
model produced by discriminating used from unused 
sites would be influenced by land-cover change in 
terms of its stability and applicability. 


Study Area 


The urbanized landscape used within the study was 
the county of Greater Manchester, UK (53°29′ N,
2°15′ W), which covers an area of 1,286 square kilometers.
It is a heavily urbanized conurbation with a
polycentric structure. The human population is just
over 2.7 million (Office of Population Census 1991).
Although described as an urban county, approxi- 
mately 50 percent of its area is occupied by farmland, 
woodland, and upland moorland, which is peripheral 
to a polycentric urban core. The county therefore has 
a highly varied landscape enhanced by variation in el- 
evation and geology. See Wright (1997) for a more de- 
tailed description. 


Methods 


Four data layers (elevation, slope, soil, and land 
cover), which are related to known badger habitat re- 
quirements within rural areas (e.g., Neal 1972), were 
imported into a geographic information system (GIS). 
Elevation data were acquired from the Ordnance Sur- 
vey in the form of a digital terrain model (DTM) with 
a resolution of 50x50 meters. The DTM was reclassified
to produce elevation data on a discrete interval scale of
50 meters. In addition, the DTM was processed
within the GIS to obtain slope data. The slope image 
was reclassified to produce a discrete interval scale of 
slope in 5-degree divisions. A 100x100-meter pixel 
resolution image of soils within Greater Manchester 
was made available to us under license from The Soil 
Survey and Land Research Centre, Cranfield Univer- 
sity. Greater Manchester contains twenty-two soil as- 
sociations that were reclassified into eight major soil 
groups as described by Avery (1980). The final data 
layers were land-cover images of Greater Manchester 
produced by supervised classification of Landsat Thematic
Mapper (TM) satellite images captured in May
1988 and June 1997.
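
The reclassification of elevation and slope into discrete classes can be sketched as below. This is an illustration only: the text does not say which slope algorithm the GIS applied, so the central-difference estimate over a 3x3 window is an assumption, as are the function names and the convention of labeling each class by its lower bound.

```python
import math


def reclassify(value, interval):
    """Map a continuous value to the lower bound of its class interval."""
    return math.floor(value / interval) * interval


def slope_degrees(window, cell_size=50.0):
    """Slope (degrees) at the center of a 3x3 elevation window.

    Uses central differences; the algorithm actually used in the study is
    not stated, so this is an assumed stand-in. cell_size is the DTM
    resolution in meters.
    """
    dz_dx = (window[1][2] - window[1][0]) / (2.0 * cell_size)
    dz_dy = (window[2][1] - window[0][1]) / (2.0 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# Hypothetical elevations (m) rising 10 m per 50-m cell toward the east.
window = [
    [100.0, 110.0, 120.0],
    [100.0, 110.0, 120.0],
    [100.0, 110.0, 120.0],
]
slope = slope_degrees(window)
slope_class = reclassify(slope, 5)                 # 5-degree slope classes
elevation_class = reclassify(window[1][1], 50)     # 50-m elevation classes
```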

Species data were acquired by surveying random 
1-square-kilometer areas of the county for badger setts 
over June-September 1992-1996. The survey covered 
approximately 40 percent of the county. Accounts of 
sightings or records of sett presence from alternative 
sources (general public, countryside rangers, etc.) were 
also used to assist the survey. See Wright et al. (2000) 
for details of the sampling methods. Over the five-year 
study period sixty-five main setts were located within 
the county of which thirteen became disused. For 
model construction, forty-seven used main setts and 
the thirteen disused setts were used, with an additional 
five active main setts retained for model validation. 


We established a control group, randomly selected from
the population of all available sites avoided by the badgers,
against which the data for suitable habitat locations
could be compared. The random
non-sett sites were obtained using randomized grid coordinates
that did not contain main setts within a
1-square-kilometer area of the grid reference. Locations
that had not been surveyed previously, along
with those where there was uncertainty about badger 
activity, were omitted. A data set of six hundred ran- 
dom 1-square-kilometer non-sett squares was created. 
More absence locations were used than locations of
badger presence, since the former were expected to
show more variability (Pereira and Itami 1991; Field- 
ing and Haworth 1995). 
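
The selection of random non-sett squares can be sketched as follows. This is a simplified illustration under stated assumptions: squares are represented as hypothetical (easting, northing) pairs in kilometers, the list of surveyed squares is assumed to exclude those with uncertain badger activity, and the function name and seed are invented for the example.

```python
import random


def sample_non_sett_squares(surveyed, sett_squares, n, seed=None):
    """Randomly pick n surveyed 1-km grid squares that contain no main sett.

    surveyed and sett_squares are iterables of (easting_km, northing_km)
    tuples; unsurveyed or uncertain squares are assumed to be excluded
    from surveyed already.
    """
    candidates = sorted(set(surveyed) - set(sett_squares))
    if n > len(candidates):
        raise ValueError("not enough surveyed non-sett squares")
    rng = random.Random(seed)
    # Sampling without replacement, so no square is chosen twice.
    return rng.sample(candidates, n)


# Hypothetical 10 x 10 km surveyed block with two sett squares.
surveyed_squares = [(x, y) for x in range(10) for y in range(10)]
main_sett_squares = [(1, 1), (2, 3)]
non_sett = sample_non_sett_squares(surveyed_squares, main_sett_squares, 20, seed=1)
```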


Statistical Analysis 


For both data sets (setts and non-sett squares), a 
1-square-kilometer area was extracted from each data 
layer. The extraction of the habitat data from the four 
data layers revealed a complex data set of forty-six 
variables, some of which were highly correlated. The 
dimensionality of the habitat data was reduced using 
principal components analysis (PCA) prior to discrim- 
ination. This produced fewer dimensions that were in- 
ternally correlated but that were unique with regard to 
other derived dimensions (Krzanowski 1988). Only 
those components having an eigenvalue greater than 
one were retained for further analysis. A direct dis- 
crimination method (James and McCulloch 1990) was 
employed. The correct classification rate was tested
against that obtained by chance using a normalized 
mutual information (NMI) measure (Forbes 1995; 
Fielding and Bell 1997). From the NMI and its vari- 
ance, a one-tailed z test was calculated to determine 
statistical significance of the observed accuracy of the 
classifier. Since the classification produced an accuracy 
higher than expected by chance, a statistical model 
was constructed. The model was built using the un- 
standardized canonical discriminant function coeffi- 
cients derived within the discriminant analysis. Un- 
standardized coefficients can be used to calculate the 
discriminant score for a new case (Huberty 1994), 
which could be compared with the mean scores calcu- 
lated for the two groups (sett squares and non-sett 
squares). The discriminant scores for each case (N =
660) were saved and imported into a GIS. These
scores were interpolated using an inverse distance 
weighting method. The resulting surface was a 
weighted linear combination of the principal compo- 
nent axes. This discriminant score surface was a visual 
representation of the predictive discriminant model, 
and hence the surface predicted the presence/absence 
of main setts. It should be noted that the model only 
identifies “potential” habitat (based solely on habitat 
variables included in the analysis), which does not 
imply that the species is actually present at a given lo- 
cation. This model will be referred to as model one. 
The whole process was repeated using land-cover data
derived from supervised classification of a Landsat TM image
captured in June 1997 in order to determine the performance
of the model after land-cover change within the
county. This model will be referred to as model two. 
Variable ordering was performed on both models, 
using the leave-one-out method, omitting one PCA 
variable in turn, and obtaining the total-group hit rate
from which a Zi value can be calculated (Huberty 
1994). The best predictor was the one associated with 
the lowest Zi value. Ecological modeling has little 
merit if the predictions cannot be, or are not, tested 
using independent data (Verbyla and Litvaitis 1989; 
Fielding and Bell 1997; Beutel et al. 1999). Therefore, 
three approaches were used to validate the models: (1) 
training and testing data using cross-validation (Efron 
1982), (2) overlaying the locations of setts not used 
within the model construction onto the models within 
a GIS and extracting discriminant scores for those lo- 
cations, and (3) field surveys of unsurveyed areas high- 
lighted as suitable sett locations by the models. 
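
The interpolation of discriminant scores into a continuous surface can be illustrated with a basic inverse distance weighting (IDW) function. This is a sketch under assumptions: the text does not give the distance exponent or search radius used in the GIS, so a power of 2 and the use of all sample points are assumed here, and the sample coordinates and scores are hypothetical.

```python
import math


def idw(x, y, points, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    points is a list of (px, py, score) tuples, where score is the
    discriminant score of a sampled square. The power of 2 is an assumed
    default, not a value taken from the study.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, score in points:
        distance = math.hypot(x - px, y - py)
        if distance == 0.0:
            return score  # the location coincides with a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * score
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical samples: one sett-like score and one non-sett-like score.
samples = [(0.0, 0.0, -2.0), (2.0, 0.0, 1.0)]
midpoint_score = idw(1.0, 0.0, samples)
near_first_score = idw(0.5, 0.0, samples)
```

Evaluating such a function at every cell of a raster would give a score surface in which cells near sampled sett squares take values close to the sett group's mean score.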


Results 


The PCA for model one produced twelve normally dis- 
tributed axes that explained 72.9 percent of the varia- 
tion of the original data set. Following input into the 
discriminant analysis, these axes separated the data 
into two groups (χ² = 141.473, df = 12, P < 0.0001).
An overall classification accuracy of 81.7 percent was 
calculated, with 72 percent of sett squares and 82 per- 
cent of non-sett squares correctly classified (Table 
20.1). An NMI of 0.187 was calculated for the confusion
matrix (Table 20.1), where z = 4.14, P < 0.0001,
indicating a classification significantly better than expected
by chance. The main discriminating variables
are shown in Table 20.2. The sett squares were dis- 
criminated from the random non-sett squares by posi- 
tive scores for PC 1 (a measure of slope), a positive 
score for PC 6 (a gradient from woodland and grass- 
land to suburban and urban land cover), and by a neg- 
ative score for PC 2 (an upland axis). Figure 20.1a 
highlights areas suitable for sett locations as those pri- 
marily found to the east and north of the county, or in 
other words those areas having a discriminant score of 
-1 to -3 (group mean discriminant score for sett
squares = -1.555). However, unsuitable areas border
these on both sides, with low-lying urbanized land to 
the center of the county and upland moorland areas 
further north and east. 
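
As an illustration of how summary measures are derived from a confusion matrix, the sketch below computes overall accuracy, sensitivity, specificity, and a normalized mutual information (NMI) statistic. The NMI formula is the form commonly given for 2 x 2 confusion matrices and is an assumption here, since the chapter does not spell it out; the function names are hypothetical. The counts are those reported for model one, and values returned by the sketch may differ slightly from those quoted in the text. The significance test on NMI is not reproduced.

```python
import math


def xlogx(n):
    """n * ln(n), with the usual convention that 0 * ln(0) = 0."""
    return n * math.log(n) if n > 0 else 0.0


def confusion_summary(a, b, c, d):
    """Summaries for a 2 x 2 confusion matrix.

    a: present, predicted present   b: absent, predicted present
    c: present, predicted absent    d: absent, predicted absent
    """
    n = a + b + c + d
    # NMI = 1 - H(actual | predicted) / H(actual), written with counts
    # (assumed form; not taken from the chapter).
    numerator = (-xlogx(a) - xlogx(b) - xlogx(c) - xlogx(d)
                 + xlogx(a + b) + xlogx(c + d))
    denominator = xlogx(n) - (xlogx(a + c) + xlogx(b + d))
    return {
        "accuracy": (a + d) / n,
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "nmi": 1.0 - numerator / denominator,
    }


# Counts reported for model one: 43 of 60 sett squares and
# 492 of 600 non-sett squares correctly classified.
model_one = confusion_summary(a=43, b=108, c=17, d=492)
```

An NMI of 1 corresponds to a perfect classification, and a value near 0 indicates predictions that carry no information about the actual group.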

The PCA for model two produced fifteen normally 
distributed axes that explained 73.8 percent of the 
variation of the original data set. Following input into 
the discriminant analysis, these axes separated the data 
into two groups (χ² = 176.883, df = 15, P < 0.00001)
with an overall classification accuracy of 83.31 percent 
(Table 20.1). An NMI of 0.113 was calculated for the 
confusion matrix (Table 20.1), where z = 4.365,
P < 0.0001, indicating a classification significantly better
than expected by chance. The main discriminating
variables are shown in Table 20.2. The sett squares 
were discriminated from the random non-sett squares 
by a positive score for PC 1 (a measure of slope), a 
positive score for PC 11 (a measure of woodland 


TABLE 20.1.

Confusion matrices for the two models showing the predictive
discriminant analysis classification results for sett^a squares
and random, non-sett squares.

                                         Predicted group
Actual group             No. samples     1               2

Model one
  1 Used sett squares    60              43 (71.7%)      17 (28.3%)
  2 Non-sett squares     600             108 (18.0%)     492 (82.0%)
Model two
  1 Used sett squares    60              41 (68.3%)      19 (31.7%)
  2 Non-sett squares     600             91 (15.2%)      509 (84.8%)

^a A sett is the burrow system occupied by the Eurasian badger (Meles
meles).
TABLE 20.2.

Descriptions of the main discriminating principal components and their
constituent variables in predicting the occurrence of Eurasian badger
(Meles meles) setts.

Model one
PCA axis    Variable                 Rotated factor loading
1           Slope 25-30°              0.9184
            Slope 30-35°              0.8982
            Slope 20-25°              0.8743
            Slope 15-20°              0.7894
            Slope 35-40°              0.7068
            Slope 10-15°              0.6724
            Altitude 251-300 m        0.4752
6           Urban                    -0.7732
            Improved grass            0.6646
            Deciduous wood            0.6382
            Semi-improved grass       0.4872
            Suburban                 -0.3456
2           Bracken                   0.8525
            Altitude 351-400 m        0.7685
            Rough grassland           0.7615
            Altitude 301-350 m        0.7466
            Moorland grassland        0.6812

Model two
PCA axis    Variable                 Rotated factor loading
1           Slope 20-25°              0.9216
            Slope 25-30°              0.9145
            Slope 15-20°              0.8685
            Slope 10-15°              0.7737
            Slope 30-35°              0.7601
            Altitude 251-300 m        0.6127
            Altitude 300-351 m        0.6126
11          Coniferous wood           0.7240
            Deciduous wood            0.5543
5           Improved grass            0.7351
            Urban                    -0.6342
            Mineral workings         -0.6150
            Semi-improved grass       0.5550

Note: PCA 2 for Model one has a negative relationship with the
discriminant group mean score for that model. All other axes (for both
models) are positively related to the group mean score, or, in other
words, the standardized canonical discriminant function coefficient is
positive.


cover), and a positive score for PC 5 (a gradient from 
grassland to urban land cover). Figure 20.1b highlights 
areas that are potentially suitable for sett locations, 
which follow a similar spatial arrangement as model 
one. For model two, the group mean discriminant score
was +1.765 for sett squares and -0.17681 for non-sett
squares.


Model Validation 


Cross-validated classifications of the data for model 
one and two produced an overall classification accu- 
racy of 81.2 and 82.6 percent respectively (Table 
20.3). The classification accuracies were significantly
higher than expected by chance: model one had
an NMI of 0.182 (z = 4.09, P < 0.0001) and model
two an NMI of 0.178 (z = 4.01, P < 0.0001).

A point vector file of the geographical locations of 
the five setts used for testing the model was overlaid
onto the two discriminant score surface images (Fig.
20.1) within a GIS. A query of the discriminant score 
at those points gave discriminant scores between 0 
and -3 for the five locations on model one (sett square
group mean score = -1.555). For model two the scores
ranged from 0 to +2 (sett square group mean score = 
+1.765). Therefore, the scores for the five setts used 
for testing the model indicated that both models could 
identify suitable sett sites, although model two was 
more conservative in its predictions than model one. 
Two areas highlighted as potentially suitable for 
sett presence by the models were ground-truthed for 
sett presence/absence in order to provide additional 
evidence for the performance of the models. In area 1, 
an occupied sett was located. Its status as a main sett 
was not determined due to time restrictions. Within
area 2, although an active sett was not located, evidence
of occupation was found through the presence
of hair and an active latrine.

Figure 20.1. Discriminant score surfaces for (a) model one, and (b) model two. The group
mean scores for sett squares and non-sett squares were different for the two models, resulting
in a change of sign for the same geographical location from model one (sett square group
mean score = -1.555) to model two (sett square group mean score = 1.765). A sett is the
burrow system occupied by Eurasian badgers (Meles meles).


Misclassifications 


Both models misclassified some data points, with the
misclassifications lying in approximately the same direction.
That is, relatively similar proportions of sett
and non-sett squares were placed in the incorrect
group by the classifier. However, model one misclassified
more non-sett squares and model two misclassified
more sett squares (Table 20.1). Of the sixty sett
squares, seventeen were misclassified by model one
and nineteen were misclassified by model two (Table
20.1), with fifteen setts common to both models. Of
the sett squares misclassified by both models (N = 15),
twelve contained large suburban and/or
urban areas within their sample square. For example,
one was adjacent to a large airport. Two were situated
on low-lying flat farmland prone to periodic flooding.
The final misclassified sett was situated in rough grassland
adjacent to open moorland.

TABLE 20.3.

Confusion matrices for the two models showing the cross-
validated discriminant analysis results for prediction of
Eurasian badger (Meles meles) setts.

                                         Predicted group
Actual group             No. samples     1               2

Model one
  1 Used sett squares    60              43 (71.6%)      17 (28.3%)
  2 Non-sett squares     600             108 (18.0%)     492 (82.0%)
Model two
  1 Used sett squares    60              40 (67.0%)      20 (33.0%)
  2 Non-sett squares     600             95 (16.0%)      505 (84.0%)


Discussion 


We have demonstrated that areas selected for use can 
be discriminated from those avoided across an urban- 
ized landscape (after Morrison and Hall, Chapter 2). 
Both models successfully classified at a rate considered 
appropriate for ideal management models (Mosher et 
al. 1986). The models were validated and again suc- 
cessfully discriminated sett squares from non-sett 
squares. There was evidence of a reduction in the abil- 
ity to correctly discriminate sett squares from model 
one to model two (Table 20.3). This may indicate that 
the change in land cover from 1988 to 1997 is in a di- 
rection that could be considered unsuitable for badger 
occupation, such as an increase in suburban land 
cover at the expense of grassland and woodland sur- 
rounding a sett site. 

Habitat selection procedures are dominated by different
variables at different scales (Manly et al.
1993). For example, the habitat selection procedure
of the Mt. Graham red squirrel (Tamiasciurus hudsonicus
grahamensis) was dominated by terrain variables
at the overall landscape scale, while vegetation characteristics
were more important at a finer resolution
(Pereira and Itami 1991). We consider this the case
within the present study, where terrain variables were
prominent in discriminating used and unused areas.
The main discriminating variables also included land 
cover as well as terrain descriptors. For example, de- 
ciduous woodland, and semi-improved and improved 
grassland were positive discriminators for sett pres- 
ence. However, this was to be expected as land cover 
is correlated with terrain. The generalized habitat as- 
sociations found within the present study agree with 
those shown in previous studies (e.g. Wiertz and Vink 
1986; Thornton 1988; Cresswell et al. 1990) in that 
badgers were most common in well-wooded, hilly 
areas. The habitat associations that emerged probably 
represent a response to a basic configuration of the 
environment (James 1971). It appeared that the 
coarse habitat association, evident for the distribution 
of badger setts within a heavily urbanized area, was 
similar to associations found elsewhere within other 
parts of its range. 

Largely the same setts (fifteen in common) were misclassified by both discriminant
analyses (Table 20.1). These were setts situated
in nontypical locations (as identified by previous au- 
thors: e.g., Thornton 1988; Cresswell et al. 1990; 
Macdonald et al. 1996), such as on low-lying flatland, 
or in upland areas without woodland. Those setts lo- 
cated in the more traditional badger habitat of decidu- 
ous woodland, with semi-improved or improved 
grassland, on sloping land were correctly classified as 
sett squares. The increase in the number of setts mis- 
classified by model two relative to model one may be 
indicative of the change of land cover within the 
county over the nine-year period. This can be seen in 
an increase in land cover considered to be indicative of 
absences, for example, an increase in suburban land 
cover (from 25 percent cover within the county in 
1988 to 31 percent cover in 1997 as calculated from 
the classified land cover images). 

The misclassifications may have been caused by a 
variety of factors. First, the cause could be due to the 
scale of the habitat data: mixed pixels commonly occur 
in remotely sensed images, especially those with a 
coarse spatial resolution (for example, the 30 × 30-meter
pixel resolution of the land cover images). Since image
classification routines assume homogeneous pixels, the
presence of mixed pixels will degrade classification ac- 
curacy (Foody and Cox 1994). This may have led to 
some inaccuracy in allocating cover types due to the 
coarse spatial resolution. Image classification may also 
be a cause of some misclassifications. The creation of 
training areas is a relatively subjective procedure, with 
inaccuracies arising from variations in pixel clarity 
within training sites. With two separate land cover im- 
ages used within the present study, captured from dif- 
ferent years, the training sites used in classifying the 
1988 satellite image could not be used to classify 
the 1997 satellite image. Therefore, variation within 
the purity of the polygons used for training may have 
led to some inaccurate labeling of land cover within 
the sample squares. A second cause of misclassifica- 
tions may again be due to the land-cover image. 
Within urban areas, particularly at the urban-rural 
fringe, temporal change in land use and cover is very 
rapid (Trietz et al. 1992). Therefore, the land cover 
surrounding a sett at the time of image capture may 
not be representative of the land cover within that lo- 
cation when the sett was constructed; in other words, 
the sett may now be surrounded by less-suitable habi- 
tat. Many of the misclassified setts contained large 
areas of suburban land cover, such as recently con- 
structed housing estates, within their sample squares. 
This location of setts within suboptimal habitats sug- 
gests setts are a valuable resource for badgers (Roper 
1993). There is evidence of strong fidelity between a 
social group and a main sett, with badgers reluctant to 
leave a well-established sett. A third cause of misclassification
is the construction of setts within apparently
atypical habitats. For example, two of the misclassified
setts were located on flat land at an elevation
of less than 25 meters in an area prone to periodic river 
flooding. However, at a finer scale the setts were situ- 
ated on small sloping banks thereby avoiding any po- 
tential for flooding. Therefore, it appears that for many 
misclassified cases, the spatial resolution of the input 
data contributed to their misclassification. 

Habitat selection may have been influenced by tem- 
poral changes (Orians and Wittenberger 1991; Morri- 
son et al. 1992; Greco et al., Chapter 14; Johnson and 
Krohn, Chapter 13) or perhaps by factors not easily 
quantified, such as anthropogenic disturbance. Skinner
et al. (1991) found that the loss of main setts
within a county in England over a twenty-year period 
was attributable to a variety of factors, in particular, 
agricultural activities and increased urbanization and 
industrialization. These temporal changes may also 
account for the present distribution of main setts 
within Greater Manchester and therefore the difference
between used and unused areas. Aaris-Sørensen
(1987) found that sett disturbance by the public, and 
in particular the dog-walking public, was one of the 
major factors contributing to a 33 percent decline of 
the badgers in the Copenhagen area over a ten-year 
period. Although such factors may be influential in 
distinguishing between used and unused areas, they 
would be difficult to obtain and quantify across a 
landscape. 

The combination of the habitat data provided a 
high ability (81.2 percent for model one and 82.3 per- 
cent for model two) to correctly classify the sample 
squares over a heavily urbanized area. However, clas- 
sification accuracy for sett squares was less than the 
overall classification accuracy, especially for those 
setts situated within more urbanized locations. This 
raises the issue of the cost of the misclassification and 
the quality of misclassified setts (Fielding and Bell 
1997). In terms of the performance of the model, a 
Type II error could be considered to be more expen- 
sive than a Type I error. If the model fails to identify a 
suitable environment for sett construction by labeling 
it as an absence site, then its use as a planning aid may 
be limited. Some setts within Greater Manchester are 
at risk through development programs. One social 
group was translocated within the study period due to 
the construction of an industrial park, and other setts 
were likewise affected by similar large-scale engineer- 
ing projects, such as road-bypass schemes. However, 
with misclassifications inevitable within predictive 
models (Fielding and Haworth 1995) it seems appro- 
priate to support the call for caution in using the pre- 
dictions of computer models to make real-life deci- 
sions (Corsi et al. 1999), especially the use of models 
whose predictions have not been tested in the field 
using independent data. 

The value of the models lies within their applicability
at the landscape level for the identification of suitable
sett sites, although they may be limited by the resolution
of the input data. The models could enable
threats to individual setts to be placed in a broader 
context, for example, when assessing planning appli- 
cations. The usefulness of the models is in their 
construction and the techniques used within their de- 
velopment, with the results of model two indicating 
stability within the methodology used. Within a heav- 
ily urbanized county with a small badger population, 
the models were able to identify topographical and 
land-cover characteristics associated with main sett lo- 
cations. For setts situated in favorable or typical loca- 
tions, the models provide a high degree of accuracy 
and highlight areas of potential sett locations. For 
those in less favorable locations, the model may provide
a means of identifying changes or restrictions in
appropriate resources and point toward the need for 
conservation measures. 
Modeling Tools and Accuracy Assessment


Randall B. Boone and William B. Krohn


As the number of ecological and management
questions with spatial and temporal components 
has increased, so has the number of modeling tools 
used to predict species occurrences and abundances 
across space and time. In turn, the methods used to 
assess the accuracy of predictions of species occur- 
rence have become more complex. The chapters of 
this section reflect that complexity. Authors explore 
methods of determining sample sizes, use a variety of 
modeling methods (e.g., classification and regression 
trees, artificial neural networks, genetic algorithms, 
Poisson regression, autologistic regression, Pearson’s 
planes of closest fit, proportional odds models, and 
identification based upon habitat use categories), and 
employ statistical and graphical reports of model 
accuracy. 

Complexity in models that predict species occur- 
rence and the assessment of these models is war- 
ranted; the systems affecting species occurrence are 
often complex (Huston, O'Connor, Wiens, Introduc- 
tory Essay, introduction to Part 1, and Chapter 65, 
respectively). Modelers must concern themselves 
with how species perceive the world (their habitats) 
and if humans can perceive the world in similar ways 
(Morrison et al. 1992). Further, these habitat compo- 
nents must be mapped to be used in spatial model- 
ing. Habitats change over time and space due to dis- 
turbance (e.g., fire, hurricane, human development), 
succession (e.g., Litvaitis 1993), and climate change. 


Habitat associations of a given species can vary over 
time (Morrison et al. 1992; Johnson and Krohn, 
Chapter 13), across regions (Collins 1983; Krohn 
1996; Smith and Catanzaro 1996), and across different
population densities (Brown 1969a; Fretwell and
Lucas 1969). Species can be very common or very 
rare (Hanski 1982), and a given species can be abun- 
dant or rare in different parts of its range or through 
time (Maurer 1994). Lastly, range edges are difficult 
to define (Price et al. 1995) and often change 
(Hengeveld 1990). Given that population levels can 
vary markedly over time and space, that species in- 
teract with hundreds and perhaps thousands of other 
species, and that each of these populations may vary 
uniquely or in concert, the prospect of predicting oc- 
currence can be daunting. Yet, the chapters in this 
section demonstrate that modelers have created 
methods that do indeed predict occurrence. 

A review of the methods used to model species oc- 
currences and abundances is too broad a topic for this 
brief introduction and is larger yet when accuracy as- 
sessment of results is included. We therefore introduce 
the chapters in this section, grouped into general 
themes, without delving into details. In our conclusions,
we provide detail on selected topics and highlight
areas requiring further research, a list that is by
no means exhaustive.




Modeling Tools and Accuracy Assessment: 
The Symposium 


A wide variety of topics was included in this section of 
the symposium. Most chapters include an assessment 
of the accuracy of demonstration results, but three 
chapters (i.e., Fielding, Chapter 21; Pearce et al., 
Chapter 32; and Schaefer and Krohn, Chapter 36) ad- 
dress assessment directly. Robertsen et al. (Chapter 
34) use standard modeling methods, then use the re- 
sults to explore management questions. The remaining 
fifteen chapters address modeling tools and methods. 


Modeling Tools and Methods 


Several authors reported on their efforts to improve 
modeling methods within the context of methods that 
are commonly used. Henebry and Merchant (Chapter 
23) review the difficulties in modeling the occurrence 
of species in space and time, and suggest pathways for 
the future. Elith and Burgman (Chapter 24) compared 
the performance of four methods of species occur- 
rence modeling (i.e., an envelope method of defining 
habitat, generalized linear models, generalized addi- 
tive models, and genetic algorithms) predicting the oc- 
currence of rare plants. They found that each model 
performed well but that the generalized linear and ad- 
ditive models predicted occurrences in novel data 
somewhat better. McKenney et al. (Chapter 31) used 
Monte Carlo methods on simulated warbler occur- 
rences to assess the effects of sample size on logistic 
regression model results. They found that fewer than 
thirty samples yielded extremely variable results, and 
more than one thousand samples yielded very precise 
results. The results of McKenney et al. (Chapter 
31) highlight the need for quantitative assessment— 
qualitatively, a map created from thirty samples and 
another from one thousand samples may both appear 
reasonable. Hartless et al. (Chapter 39) describe a fac- 
torial design they recommend for use when conduct- 
ing individual-based modeling and demonstrate the 
technique while modeling white-tailed deer (Odo- 
coileus virginianus) occurrence in Florida. We believe 
their methods could be used by those employing many 
modeling methods. Finally, Dunham et al. (Chapter 
26) demonstrate how logistic regression can be applied
to watersheds (considered patches in their application),
with predictions of presence/absence constrained
by water flow, for two fish species.

As methods of spatial modeling of species occur- 
rence mature, modelers are encouraged to move away 
from methods that either violate underlying assump- 
tions of statistical techniques or lead to a loss of infor- 
mation. Autocorrelation in animal occurrence data or 
predictor variables (e.g., elevation or climate) violates 
assumptions of independence of points in common re- 
gression techniques. Henebry and Merchant (Chapter 
23) review the errors introduced by inattention to 
autocorrelated data, such as elevated significance esti- 
mates. Cablk et al. (Chapter 37) also review the effects 
of autocorrelation on model results and provide an al- 
ternative method, classification and regression trees. 
Cablk et al. (Chapter 37) use tree regression to correlate
vertebrate richness in Oregon with a suite of environmental
variables (others with a similar approach
include O'Connor et al. 1996; Wickham et al. 1997; 
Boone and Krohn 2000a) and assess those models 
with an intriguing use of straightforward spatial sta- 
tistics. Correlations between the environment and 
species richness tend to be at broad scales (Cablk et 
al., Chapter 37), making their uses in conservation dif- 
ferent from those of detailed species predictions. 
However, these analyses can give us early indications 
of how a suite of vertebrates may respond to broad- 
scale environmental changes, such as global warming. 
Klute et al. (Chapter 27) encourage the use of an au- 
tologistic regression model to model autocorrelated 
presence/absence data rather than use of the typical lo- 
gistic regression. They describe the modified model 
and its autocovariance term. 
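
To make the idea of an autocovariance term concrete, the sketch below computes, for one focal site, a distance-weighted mean of presence at neighboring sites. This is one commonly used formulation and is an assumption here, not necessarily the term described in Chapter 27; the inverse-distance weights, the neighborhood radius, and the function name autocovariate are likewise hypothetical. The resulting value would enter a logistic regression as one additional predictor.

```python
import math


def autocovariate(sites, index, radius):
    """Distance-weighted mean occupancy of neighbors within `radius`.

    sites: list of (x, y, present) tuples with present in {0, 1}.
    Weights are 1 / distance (an assumed, commonly used choice).
    Returns 0.0 when the focal site has no neighbors in range.
    """
    fx, fy, _ = sites[index]
    weighted_presence = 0.0
    weight_total = 0.0
    for j, (x, y, present) in enumerate(sites):
        if j == index:
            continue  # the focal site does not contribute to its own term
        dist = math.hypot(x - fx, y - fy)
        if 0.0 < dist <= radius:
            weight = 1.0 / dist
            weighted_presence += weight * present
            weight_total += weight
    return weighted_presence / weight_total if weight_total > 0 else 0.0


# Hypothetical survey points: three clustered sites and one isolated site.
survey = [(0.0, 0.0, 1), (1.0, 0.0, 1), (2.0, 0.0, 0), (10.0, 10.0, 1)]
terms = [autocovariate(survey, i, radius=2.5) for i in range(len(survey))]
```

Sites surrounded by occupied neighbors receive values near 1, so the term absorbs spatial dependence that would otherwise violate the independence assumption of ordinary logistic regression.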

Guisan (Chapter 25) encourages those using ordi- 
nal data (e.g., absent, rare, common, and abundant) in 
species modeling not to simply collapse the data to 
presence/absence and use logistic regression. Instead, 
modelers of such data should use methods that make 
use of these semiquantitative data (e.g., the propor- 
tional odds model). In his demonstration set, Guisan 
(Chapter 25) showed that the semiquantitative analy- 
ses yielded results on par with logistic results but with 
the added value of semiquantitative estimates of abun- 
dance. Violation of assumptions of normality in abun- 
dance data is another reason researchers will collapse 
data to binomial responses and use logistic regression, 
or use linear regression inappropriately. Abundance
data often exhibit highly skewed distributions, with
zeros (i.e., not found) occurring most frequently. Jones
et al. (Chapter 35) demonstrate the use of Poisson regression,
which is appropriate for highly skewed data,
and, as with the semiquantitative method introduced 
by Guisan (Chapter 25), abundances are included in 
the model results. Further, as occurrence counts in- 
crease and the distributions approach normality, Pois- 
son regression remains appropriate, allowing a mod- 
eler to use a single model form (Jones et al., Chapter 
35). Johnson and Sargeant (Chapter 33) provide a 
means of using smoothing techniques on relatively 
sparse presence/absence data associated with atlases to 
create models showing relative probability of occur- 
rence. Atlas maps were modified using simple smooth- 
ing methods, as well as methods using additional habi- 
tat information, with some evidence that the 
smoothed maps yielded better predictions than the 
original atlas data. Finally, Smith and Jenks (Chapter 
38) contrast maps showing presence/absence for se- 
lected small mammals of South Dakota using typical 
species/habitat association methods with maps show- 
ing relative abundance, created based upon habitats in 
which trapped mammals had occurred. 

In a review, Fielding (Chapter 21) noted that train- 
ing sites must be representative of areas in which 
models will be used (see also Krohn 1996). A similar 
concern is outlined by Rotenberry et al. (Chapter 22), 
where their original effort used Mahalanobis distances 
to model sparrow habitat, a method that assesses 
habitat quality of a given point by comparing its at- 
tributes to the centroid of those attributes for occu- 
pied habitats. The method performs well when train- 
ing and testing sets are similar. However, if habitat 
vastly improves at a site, for example, Mahalanobis 
distance methods may show the habitat as lower qual- 
ity, because habitat attributes have moved away from 
the centroid of (relatively poorer) habitat that had 
been occupied. Rotenberry et al. (Chapter 22) recom- 
mend an alternative termed Pearson's planes of closest 
fit (e.g., Collins 1983), which uses linear relationships 
with habitat variables that are consistently important. 
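
The centroid-based behavior described above can be illustrated with a minimal sketch. It is restricted to two habitat attributes so that the covariance matrix can be inverted by hand; the attribute values and the function name mahalanobis_2d are hypothetical and are not taken from Chapter 22.

```python
def mean(values):
    return sum(values) / len(values)


def mahalanobis_2d(point, occupied):
    """Mahalanobis distance from `point` to the centroid of `occupied`.

    Both arguments hold (x1, x2) pairs of two habitat attributes.
    """
    xs = [p[0] for p in occupied]
    ys = [p[1] for p in occupied]
    mx, my = mean(xs), mean(ys)
    n = len(occupied)
    # Sample variances and covariance of the occupied sites.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in occupied) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] S^-1 [dx dy]^T with the 2 x 2 inverse written out.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 ** 0.5


# Hypothetical occupied sites: (shrub cover %, litter depth cm).
occupied_sites = [(20, 5), (25, 6), (30, 8), (22, 5), (28, 7)]

typical = mahalanobis_2d((26, 6), occupied_sites)
improved = mahalanobis_2d((60, 15), occupied_sites)  # far beyond the centroid
```

In this toy data set the "improved" site scores a larger distance than the typical site, which is the situation in which a centroid-based method would rate better habitat as lower quality.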

Artificial neural network analyses and genetic algo- 
rithms are provided as alternatives to traditional 
methods. These methods, in addition to evolutionary
computation (e.g., Fogel 2000), address optimization
problems by mimicking biological systems. Neural 
networks (Lek and Guégan 1999), which mimic learn- 
ing processes, have several advantages over methods 
such as multiple linear regression (e.g., no prior as- 
signment of a model form and no assumed distribu- 
tion of data) (Lusk et al., Chapter 28). Lusk et al. use 
artificial neural networks to predict bobwhite quail 
occurrences in Oklahoma, with some success. Elith 
and Burgman (Chapter 24) include a genetic algo- 
rithm in the suite of models they compared. Genetic 
algorithms generate potential solutions to optimiza- 
tion problems—such as optimizing agreement between 
species occurrence and environmental variables—by 
emulating genetic processes (e.g., mutations, cross- 
overs, combinations). Many solutions are tested, and 
those most successful spawn related solutions that are 
in turn tested, while others are discarded, ultimately 
leading to optimal solutions (Fogel 2000). 
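
The generate, test, and discard cycle described above can be sketched in a few lines. The example is a toy under stated assumptions, not the algorithm used in Chapter 24: each candidate solution is a simple two-value environmental envelope, fitness is the fraction of sites whose presence or absence the envelope predicts correctly, and the population size, mutation spread, and function names are all hypothetical.

```python
import random


def fitness(envelope, sites):
    """Fraction of sites whose presence/absence the envelope predicts."""
    lo, hi = min(envelope), max(envelope)
    correct = sum((lo <= value <= hi) == bool(present)
                  for value, present in sites)
    return correct / len(sites)


def evolve(sites, generations=40, pop_size=30, seed=1):
    rng = random.Random(seed)
    values = [v for v, _ in sites]
    vmin, vmax = min(values), max(values)
    population = [(rng.uniform(vmin, vmax), rng.uniform(vmin, vmax))
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=lambda e: fitness(e, sites),
                        reverse=True)
        history.append(fitness(ranked[0], sites))
        parents = ranked[:pop_size // 2]   # discard the weaker half
        children = [ranked[0]]             # keep the best solution unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            child = (mother[0], father[1])                          # crossover
            child = tuple(g + rng.gauss(0.0, 0.5) for g in child)   # mutation
            children.append(child)
        population = children
    best = max(population, key=lambda e: fitness(e, sites))
    return best, history


# Hypothetical sites along one gradient; the species occurs at values 4-7.
toy_sites = [(v, 1 if 4 <= v <= 7 else 0) for v in range(11)]
best_envelope, best_by_generation = evolve(toy_sites)
```

Because the best solution of each generation is carried forward unchanged, the best fitness recorded per generation cannot decrease, although nothing guarantees that a global optimum is reached.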


Accuracy Assessment 


Evidence is now strong that modelers are more adept 
at predicting occurrences and abundances of species 
that are common than those that are rare (and associ- 
ated attributes, e.g., secretive, quiet, using few habi- 
tats, or inhabiting remote areas) (Edwards et al. 1996; 
Boone and Krohn 1999; Garrison and Lupo, Chapter 
30; Hepinstall et al., Chapter 53; Karl et al., Chapter 
51; Schaefer and Krohn, Chapter 36). The stochastic 
component associated with whether or not a site is oc- 
cupied by a rare species is larger than for a common 
species, leading to predicted distribution maps with 
larger commission errors. Schaefer and Krohn (Chap- 
ter 36) partition this commission error into actual 
error due to mistaken modeling form, and apparent 
error due to incomplete field surveys. The authors 
show that for the birds that bred within Maine, the 
apparent error was high for rare species inhabiting 
smaller test sites, suggesting field surveys were indeed 
incomplete (for direct evidence of incomplete surveys, 
see Nichols et al. 1998b). Incomplete surveys can 
reflect low detectability of species. Recognizing that 
detectability affects the number of surveys required 
and the power of statistical analyses, Stauffer et al. 
(Chapter 29) modified power analysis methods to 
incorporate detectability explicitly. They also showed 
that to characterize occurrence of marbled murrelet
(Brachyramphus marmoratus) within a region, single 
surveys in many places are more beneficial than re- 
peated surveys at given sites (Link et al. 1994; Maurer 
1994). Last, Garrison and Lupo (Chapter 30) provide 
an example, where their efforts to model species 
ranges based upon habitat maps and associations were 
more successful for common species than rare species. 
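
Commission and omission errors for a single test site can be tallied as in the sketch below. The denominators (species predicted for commission, species observed for omission) are an assumption, since conventions vary among studies, and the species lists and function name are hypothetical.

```python
def commission_omission(predicted, observed):
    """Compare a predicted species list with a field-survey list.

    Returns the two error sets and their rates. Rates use the number
    predicted (commission) and the number observed (omission) as
    denominators, which is an assumed convention.
    """
    predicted, observed = set(predicted), set(observed)
    commission = predicted - observed  # predicted but not detected
    omission = observed - predicted    # detected but not predicted
    commission_rate = len(commission) / len(predicted) if predicted else 0.0
    omission_rate = len(omission) / len(observed) if observed else 0.0
    return commission, omission, commission_rate, omission_rate


# Hypothetical lists for one test site.
predicted_species = ["ovenbird", "hermit thrush", "bobolink", "winter wren"]
observed_species = ["ovenbird", "hermit thrush", "veery"]

comm, omis, comm_rate, omis_rate = commission_omission(
    predicted_species, observed_species)
```

A species in the commission set may reflect either a modeling error or an incomplete survey, so the tally alone cannot separate actual from apparent commission error.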

Fielding (Chapter 21) warns against overfitting
species occurrence data used in training models and stresses
the importance of assessment to detect overfitting. Incidences
or abundances in training data can often be
modeled well, but modelers may describe not only the 
ecological relationships underlying patterns, but also 
individual variations in data points, including noise. 
An example is provided by Lusk et al. (Chapter 28). 
We agree with the authors that artificial learning tech- 
niques such as neural networks will become more im- 
portant in future modeling work. However, Lusk et al. 
(Chapter 28) also show how artificial neural networks 
can overfit data (r = 0.96 in training, describing quail 
occurrence in Oklahoma, r = 0.42 in testing). 

There is a consensus that multiple assessment terms 
should be reported when describing the accuracy of 
species occurrence or abundance models (Guisan, 
Chapter 25; reviewed in Henebry and Merchant, 
Chapter 23). Simple overall errors should be reported 
(e.g., Johnson and Sargeant, Chapter 33) as well as 
commission and omission errors (e.g., Schaefer and 
Krohn, Chapter 36). These measures provide straight- 
forward ways of understanding model performance 
but are less helpful when comparing the relative suc- 
cess of models of different species (Boone and Krohn 
1999). Guisan (Chapter 25) reviews statistics such as 
Kappa, which may be used to report accuracy in con- 
tingency tables. When reporting logistic regression as- 
sessments, instead of the traditional reporting of logis- 
tic regression probabilities less than 0.5 as absent and 
greater than or equal to 0.5 as present, receiver oper- 
ating characteristic (ROC) plots (Fielding and Bell 
1997; Elith and Burgman, Chapter 24; Fielding, 
Chapter 21; Guisan, Chapter 25; Pearce et al., Chap- 
ter 32) quantify accuracy for the full range of proba- 
bilities. Finally, Pearce et al. (Chapter 32) provide a 
framework for reporting errors in presence/absence 
models. They encourage the use of discrimination histograms
(which quickly identify how well presences
and absences are being modeled), ROC plots, and calibration
plots (which reflect whether the relative occurrence
of species is being modeled well). Statistical
measures are also proposed, which reflect agreement, 
biases in modeling, and spread (Pearce et al., Chapter 
32). 
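
The threshold-free summary that ROC plots provide can be sketched in a few lines. This is a minimal illustration and not the procedure of any chapter author: the function names and the toy scores are assumptions, and the area under the curve is computed through its rank (Mann-Whitney) equivalence rather than by plotting.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) pairs, one per threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Sweep thresholds from the highest score downward; a case is predicted
    # "present" when its score is at or above the threshold.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) equivalence."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count presence/absence pairs ranked correctly; ties count one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities and observed presence (1) / absence (0).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

curve = roc_points(scores, labels)
area = auc(scores, labels)
print(curve)
print(round(area, 3))
```

An area of 0.5 would indicate no discrimination between presences and absences, and 1.0 perfect discrimination, whatever threshold is later chosen.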


Application and Management 


Lastly, we cite the modeling effort by Robertsen et al. 
(Chapter 34), in which they used traditional logistic 
regression methods to predict the probability of occur- 
rence of six species of birds in Wisconsin. They then 
linked the predicted distributions to simulated changes 
in forest cover due to harvest and fire and loss of habi- 
tat from development. Their efforts remind us of the 
importance of linking spatial models of predicted
species occurrences to management questions.


Conclusions and the Future 


In general, we believe that the methods used to model 
the occurrence and abundance of species are more de- 
veloped than those used to assess those predictions. 
The number of chapters in this part of the volume that 
describe methods attests to this. Methods are improv- 
ing rapidly, but limitations and research questions re- 
main. As examples, researchers seek to know how an- 
imals perceive their surroundings, defining both the 
coarse- and fine-scale limits upon the distribution of 
species. Once researchers understand what animals 
perceive, they must create metrics and, if possible, 
maps that reflect that perception. Species/habitat rela- 
tionships are scale (grain and extent) dependent, so 
modelers must work at the scale most appropriate for 
the issue at hand. In some cases, this will include 
working at more than one scale. Modelers are also 
limited by the continued lack of natural history infor- 
mation for many species. The habitat associations, for 
example, of common game species in North America 
have been described in some detail, but many ques- 
tions remain; our uncertainty of habitat associations 
of nongame species is often large (Smith and Catanzaro
1996; Karl et al. 1999). How selection within a
species varies by sex, season, and social status is
often unknown (Morrison et al. 1992). Lastly, how
species relate to even simple landscape metrics that modelers
calculate (e.g., patch size, patch count, area-to-perimeter
ratio) is unknown for most species.

The spatial nonindependence of data used in modeling
continues to limit our progress, especially if ignored
by modelers. The implications are reviewed in
this volume by Henebry and Merchant (Chapter 23) 
and Cablk et al. (Chapter 37), and so we will avoid 
repetition and simply add our voices to those saying 
nonindependence cannot be ignored. Instead, we cau- 
tion against a reflexive removal of spatial patterning 
or its assignment to a stand-alone coefficient. In some 
applications, modelers seek to create an optimum pre- 
dictive surface or to identify local effects on the re- 
sponse, and here, spatial pattern should be removed. 
In other applications, the structure of the spatial pat- 
tern itself is of interest (Legendre 1993). Klute et al. 
(Chapter 27) demonstrate that coefficients of predictor
variables decline when an autocovariance coefficient
is included, but that coefficient can mask interesting
spatial patterns. Spatial autocorrelation in
species occurrences may be due to the underlying population
dynamics of the species, in which case partitioning
variation into an autocovariance coefficient is appropriate.
In contrast, spatial autocorrelation may indeed be
due to a correlation with a spatially autocorrelated 
predictor variable, such as elevation. Here, reassigning 
variation from predictors to an autocovariance coeffi- 
cient would be misleading. A method of contrasting 
between these sources of variation has been proposed 
by Borcard et al. (1992) and demonstrated in richness 
analyses by Boone and Krohn (2000b). 

With today's computers and the ready availability 
of statistical packages, models predicting the occur- 
rences of species can be generated in great numbers. 
But which model to use? The authors of the chapters 
consider a variety of approaches and discuss a number 
of the issues related to this question. A useful ap- 
proach not stressed is to make quantitative compar- 
isons between models, comparing relative measures of 
fit, such as AIC (Akaike's Information Criterion) in- 
dices. For a complete treatment of this topic, see Burn- 
ham and Anderson (1998) and Anderson et al. (2000). 
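
As a rough sketch of such a quantitative comparison, the snippet below ranks candidate models by AIC, defined as 2k - 2 ln(L) for a model with k estimated parameters and maximized likelihood L, and converts the AIC differences to Akaike weights. The candidate names, log-likelihoods, and parameter counts are hypothetical values chosen only for illustration.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Relative support for each model, given the candidate set."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical candidates: (name, maximized log-likelihood, parameter count).
candidates = [
    ("elevation only", -120.4, 2),
    ("elevation + forest cover", -112.9, 3),
    ("full model", -112.1, 6),
]

aics = [aic(ll, k) for _, ll, k in candidates]
weights = akaike_weights(aics)
best = min(zip(aics, candidates))[1]

for (name, _, _), a, w in zip(candidates, aics, weights):
    print(f"{name:26s} AIC = {a:6.1f}  weight = {w:.3f}")
```

In this toy set the full model has the highest likelihood, but its extra parameters are penalized, so the intermediate model is ranked first.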

This brief introduction can only stress that many 
questions in modeling species occurrence remain. Re- 
searchers should strive for more rigorous definitions 


of the range limits of species and to gain a general pic- 
ture of their dynamics through space and time (Krohn 
1996). The utility of biogeographic relationships in 
species modeling across scales (extent and grain) is be- 
ginning to be explored (e.g., Root 1988c; Walker 
1990; Scott et al. 1993; Venier et al. 1999). The effects 
of population dynamics upon predictions of species 
occurrence require more insight. As examples, modelers
create models for populations that may be at or
near carrying capacity and then apply them to areas 
that are understocked, or vice versa. Such methods 
may not yield accurate predictions of species occur- 
rence (Krohn 1996). Also, how well species models 
apply to both source populations and sink popula- 
tions in a given species remains to be explored (e.g., 
Pulliam and Danielson 1991). 

Methods of assessment are lagging behind advances 
in methods of modeling. Simply put, it is both more 
straightforward and more rewarding to create pre- 
dicted occurrences than it is to assess their accuracy. 
However, progress is being made in assessment, as re- 
flected in this volume. A sometimes-neglected compo- 
nent of assessment is a consideration of risk, such as 
the cost of classifying habitats incorrectly (Green 
1979; Krohn 1996; Smith and Catanzaro 1996; Field- 
ing, Chapter 21). We also encourage further research 
into the assessment of individual layers used in species 
occurrence modeling, as well as how errors propagate 
in the modeling process (e.g., Haining and Arbia 
1993; Krohn 1996). It remains important to assess 
species occurrences and abundances using a variety of 
techniques and to report a variety of results (e.g., re- 
sampling efforts, assessment with new data, expert 
opinion, reporting simple errors and statistics, pre- 
senting graphical summaries). Experience and analyses 
demonstrate that assessments using independently col- 
lected data require a large number of samples to yield 
a useful level of power. In general, methods for assess- 
ing individual sites appear to be reasonably robust. 
Methods for assessing species occurrence across a re- 
gion appear weaker. Methods of assessing the models 
for a suite of species across a region are still in early 
development. Lastly, methods used to compare model- 
ing efforts between regions are poorly developed. 
When modeling a suite of species, such as in gap 
analysis (Scott et al. 1993), careless comparisons of 
errors between sites or between states can be misleading.
Sites within a region, even if in close proximity,
can have different species compositions due to climatic,
historical, and other space-specific effects.

We reiterate a point already made by Fielding and 
Bell (1997): tests of model accuracy and the metrics 
used to report those tests must be related to the in- 
tended use of the model. As an example, commission 
errors may reflect previously occupied habitat when 
modeling a species with a shrinking geographic range. 
When modeling a suite of species, calibration plots 
(Pearce et al., Chapter 32) may be most useful, identi- 
fying whether the relative abundances of the species 
are being predicted well. 

Methods of assessment used in the future must rec- 
ognize that species are not equally likely to be suc- 
cessfully modeled, depending upon detectability, for 
example. We used these differences prior to modeling
to essentially form hypotheses regarding our assessments
(the Likelihood Of Occurrence Ranks
[LOORs] of Boone and Krohn 2000a). Fielding
(Chapter 21) promotes the creation of assessment 
techniques that remove the effects of differences in 
detectability. One method may prove more useful, but 
at this stage of knowledge, research on both fronts 
should continue. 

Species that are adapted to produce a few highly de- 
veloped young (i.e., K-selected; reviewed in Pianka 
1970) generally compete well in a specific set of habi- 
tats. That habitat specificity allows modelers to predict 
species occurrences reasonably well (although if the 
species is rare, commission errors may remain high). In


contrast, species that produce many less-developed 
young (r-selected species) have populations that can be 
irruptive. Habitats approaching the species’ ideal may 
not contain the species in a given year, but in the fol- 
lowing year, even relatively poor habitats may be occu- 
pied (Krohn 1992). Modelers have certainly modeled 
species on both ends of the continuum successfully. 
Our contention, however, is that commission errors of 
models of K-selected species are most often associated 
with the relative abundance of the species, whereas the 
commission errors associated with r-selected species 
are associated with the irruptive populations. In prac- 
tice, assessments of r-selected species may yield more 
variable results than those for K-selected species. 
Clearly research on this issue of modeling K- versus 
r-selected species is needed. 

It appears there is no unifying method that will 
allow us to successfully model the occurrence of all 
species. Each species can present its own complexities 
that must be understood and overcome. However, 
methods for modeling are improving as are methods 
of assessing the accuracy of predictions. Increased co- 
operation between disciplines and the expanding in- 
formation base bode well for future efforts. Modelers 
must mesh the information and skills geographers can 
bring to modeling species occurrence, the analyses of 
pattern and change that landscape ecologists add, the 
knowledge of the ecological relationships of individual 
species that wildlife scientists add, the rigor of analy- 
ses that statisticians provide, and the knowledge that 
managers bring to use our predictions to improve con- 
servation efforts. 


CHAPTER 


21 


What Are the Appropriate Characteristics 
of an Accuracy Measure?


Alan H. Fielding 


Ecologists who predict the distribution of wildlife
have been exposed to considerable debate about 
the relative merits of a variety of statistical methods 
(e.g., logistic regression versus discriminant analysis). 
There has been much less debate about the appropri- 
ate ways to measure our prediction errors (Fielding 
and Bell 1997). The problem appears to be that we 
have focused our discussions on the statistical assump- 
tions of the predictive techniques (fitting the data to 
the model) and have largely ignored the best ways to 
assess the prediction accuracy. Usually the default op- 
tions, offered by the software, are all that are re- 
ported. In fact, there are a number of alternative ap- 
proaches to accuracy assessment, including those 
commonly used with machine learning (ML) models 
(Fielding 1999). Because machine-learning methods 
are not predicated on the presumption of well-defined 
probability density functions there has been little in- 
terest in the development of significance-testing meth- 
ods. This means that accuracy assessments can use 
methods that are not constrained by normality or 
maximum-likelihood considerations. Instead of signif- 
icance tests, models based on machine-learning meth- 
ods are frequently validated by testing them on inde- 
pendent data sets. 
This discussion is restricted to a consideration of the 
performance of a generalized predictive technique 
called a classifier that allocates cases to a small number 


of predefined classes. This definition excludes unsuper- 
vised clustering techniques and methods that predict 
the value of a continuous variable but includes multi- 
variate techniques such as logistic regression, discrimi- 
nant analysis, and classification and regression trees 
(CART). It also includes more novel methods, such as 
artificial neural networks (see Boddy and Morris 
[1999] for a review of their ecological applications). 

It is useful to begin with two fundamental ques- 
tions. First, why are we making the predictions? For 


example, is it to gain some insight into the ecology of 
an organism, or is it to predict the future or current 
distribution? Second, what level of accuracy do we ex- 
pect from our model? Ecologists work in a relatively 
knowledge-poor environment where stochastic events, 
such as historical “accidents” and opportunism, can 
play a central role. Even if the real variables that con- 
trol the distribution of a species can be identified they 
probably can't be measured directly. Therefore, we 
should not be too surprised if predictions lack accu- 
racy; indeed, perhaps we should be suspicious if our 
accuracy is very high. 

Imagine a scenario in which you have been asked to 
predict the future distribution of a species following 
some proposed environmental impact. The accuracy 
of your predictions is questioned at a planning in- 
quiry. How confident are you that you can defend 
your predictions? Amongst the many questions that 
may be asked in such circumstances is “what do you
understand by accuracy?” One definition is that it is
the nearness of the predicted values to their actual val- 
ues. However, this leads to a follow-up question about 
the nature of the actual values. Unfortunately, in 
wildlife distribution studies we rarely have a gold 
standard against which we can judge our predictions. 
If we were predicting the status of a banknote (forged 
versus genuine) or the gender of an individual, there is 
no ambiguity. In wildlife distribution studies, we have 
the added complications of scale, time, detectability, 
and biological variability. We must, therefore, begin 
by setting very clear goals against which we can judge 
our predictions. 

Suppose that we wished to predict the distribution 
of the peregrine falcon (Falco peregrinus) nest sites in 
Britain. What is the appropriate reality against which 
predictions should be judged? Is it the current, histori- 
cal, future, actual or potential distributions? This is im- 
portant because Johnson and Krohn (Chapter 13) have 
demonstrated that we need to understand the popula- 
tion dynamics of the species at the time that the data 
used to develop the model were collected. Similarly, 
how should a nest site be categorized if it is only occu- 
pied when an adjacent golden eagle (Aquila chrysaetos) 
nest site is not occupied? In addition, what is the ap- 
propriate scale for such a study? At one extreme, we 
may be 10 meters out in our prediction of a nest loca- 
tion; at another, we may fail to predict the presence of 
a nest in a 4-square-kilometer square. Surely the sec- 
ond of these is the more serious error. Changing the 
scale of a study will influence our accuracy and move 

our focus away from the decisions made by individuals
(behavioral ecology) to the relationships between the 
climate and the landscape and the broad scale distribu- 
tion of a species (biogeography). 


The Nature of a Classifier 


It is important to understand that the separation and 
identification achieved by all classifiers is constrained 
by the application of some algorithm to the data. For 
example, many statistical methods assume that classes 
are linearly separable. In a linear discriminant analy- 
sis, class membership is assigned following the appli- 
cation of some threshold to a score calculated from 
$\sum_i b_i x_{ij}$, where $x_{ij}$ is the score for case $j$ on predictor $x_i$.


TABLE 21.1.


A comparison of statistical and pragmatic issues that should
be considered during classifier development.^a


Statistical issues       Pragmatic issues

Model specification      Model accuracy
Parameter estimation     Generalizability
Diagnostic checks        Model complexity
Model comparisons        Cost


^a Based on Table 1 in Hosking et al. 1997.


The problem then reduces to one of estimating the values
of the coefficients ($b_i$) using a maximum-likelihood
procedure that depends on an assumed probability 
density function. Because the values of these coeffi- 
cients are unknown, but presumed to be fixed, the em- 
phasis moves toward the structure of the separation 
rule rather than the individual cases. Using statistical 
classifiers, we can usually obtain confidence intervals 
for our parameter estimates. However, these are con- 
ditional on having specified the appropriate statistical 
model. The classification accuracy of a technique such 
as logistic regression is largely independent of the 
goodness of fit (Hosmer and Lemeshow 1989). Part of 
the problem is that classification accuracy is sensitive 
to the group sizes (prevalence), and cases are more 
likely to be assigned to the largest group. Thus, the 
criteria by which a logistic regression should be 
judged are those relating to the statistical model rather 
than the classification accuracy. 

Perhaps it is time to rethink our approaches to the 
application of classifiers to ecological problems. In 
particular, we could move the emphasis away from 
statistical concerns, such as the goodness of fit, to 
more pragmatic issues (Table 21.1), such as accuracy 
and cost. 


Some Causes of Prediction Failure 


Using the pragmatic criteria in Table 21.1, it is possi- 
ble to examine situations under which a classifier is 
likely to be inadequate. 


1. The form of the classifier is too complex and over- 
fits the data. This tends to arise when the parame- 
ters:cases ratio exceeds some desirable limit and we 
Figure 21.1. A decision boundary (dashed line) that over-fits
the data. This boundary is unlikely to separate other samples 
with the same accuracy. 


begin to fit the random noise in the data (Fig. 
21.1). This will lead to poor generalization. 

2. The form of the classifier is too simple or has an in- 
appropriate structure. For example, classes may 
not be linearly separable or important predictors 
have been excluded (Conroy et al. 1995). This will 
reduce accuracy. 

3. The initial class membership is incorrect, or 
“fuzzy.” Most classifiers assume that class mem- 
bership is known without error. Obviously, if 
classes are not clearly defined, it becomes more dif- 
ficult to apply a classification procedure, and any 
measure of accuracy is likely to be compromised 
(see Magder and Hughes [1997] and Magder et al. 
[2000] for examples using logistic regression). Al- 
though the assumption of known groups is reason- 
able for discrete classes such as gender, it is ques- 
tionable in other situations such as wildlife- 
distribution models. It is easy to imagine scenarios 
in which the dichotomization of cases could be 
compromised by ecological difficulties. For exam- 
ple, we should expect under-recording for rare or 
cryptic species and those that live in difficult habi- 
tats such as the canopy of a tropical rain forest. 
Boone and Krohn (1999) and Schaefer and Krohn 
(Chapter 36) have suggested that we can use a pri- 
ori information on features such as detectability to 
place the accuracy of our predictions into an ap- 
propriate context. If cases are initially misclassi- 
fied, this is likely to affect the number of prediction 
errors. More importantly, misclassified cases will
influence the structure of the classifier, leading to 
bias or classifier degradation; Maurer (Chapter 9) 
showed how errors in the predictor variables, 
when combined with errors in the response vari- 
able (presence/absence in these examples), can pro- 
duce a systematic bias. For example, misclassified 
cases are likely to be outliers on some predictors, 
but their influence will depend on the classifier. In a 
discriminant analysis, outliers can have large ef- 
fects because of their contribution to the covari- 
ance matrix. Conversely, in a CART analysis, their 
influence on the classification rules may be trivial 
because they are assigned to their own branch (Bell
1999).

4. Training cases (those used to produce the classifica- 
tion rule) may be unrepresentative. If they are, it 
will lead to bias and poor performance when the 
classifier is applied to new cases. Careful sampling 
designs should reduce this problem, but such bias 
may be unavoidable if there is significant regional and
temporal variability in the habitat relationships. 


Characteristics of an Ideal 
Accuracy Index 


When we make predictions about the distribution of 
wildlife, we gain little if all that we predict is the dis- 
tribution of the animals used to produce the predic- 
tions (Beutel et al. 1999; Verbyla and Litvaitis 1989). 
To be most helpful, our predictions must be accurate,
general, and unbiased. An essential component of any 

index of prediction accuracy is that it was obtained 
from data that are independent of those used to gener- 
ate the prediction rules. In other words, it is important 
to have some idea about how well the classifier will 
perform with new data. This is needed because the ac- 
curacy achieved with the original data is often much 
greater than that achieved with new data (Henery 
1994). When a classifier is tested, its future error rate 
is estimated from a confusion matrix (a cross-tabula- 
tion of the number of cases correctly and incorrectly 
assigned to each of the classes) that was obtained, 
hopefully, from randomly selected members of 
the population. It can be shown that the expected 
performance of a classifier is linked to the size of the
training set and that larger test sets reduce the vari- 
ance of the error estimates. Unfortunately, the number 
of available test sets is frequently small so error rate 
estimates are normally imprecise. 

The two data sets needed to develop and test pre- 
dictions are known by various synonyms. The terms 
training and testing data are used here. The problem 
now becomes one of finding appropriate training and 
testing data. Ecologists seem to have paid little atten- 
tion to the range of available methods or how the 
choice may influence the estimated error rates. One 
exception is Verbyla and Litvaitis (1989), who briefly 
reviewed a range of partitioning methods in their as- 
sessment of resampling methods for evaluating classi- 
fication accuracy. 

Resubstitution (reuse of the training data) is the 
simplest way of testing the performance of a classifier. 
Unfortunately, this provides a biased assessment of the 
classifier’s future performance, possibly because the 
form of the classifier has been determined by some 
model-selection algorithm (e.g., stepwise variable se- 
lection). An inevitable consequence of model selection 
processes is that the final model tends to “overfit” the 
training data because it has been optimized to deal 
with the nuances in the training data (Fig. 21.1). This 
bias may still apply if the same set of “independent” 
testing data were used to verify the model selection 
(Chatfield 1995). 

The best assessment of a classifier's future value is 
to test it with some truly independent data—ideally, a 
sample collected independently of the training data 
(prospective sampling). Because this is often difficult, 
a common practice is to split or partition the available 
data to provide the training and the “independent” 
testing data. Unfortunately, partitioning the existing 
data is not a perfect solution since it is less effective 
than collecting new data. In addition, the inevitable 
reduction in the size of training set will usually pro- 
duce a corresponding decrease in the classifier's accu- 
racy. There is, therefore, a trade-off between having a 
large test set that gives a good assessment of the classi- 
fier’s performance and a small training set that is likely 
to result in a poor classifier. 

The simplest partitioning method splits the data
into two unequally sized groups. The larger partition


is used for training. Huberty (1994) provided a heuris- 
tic for determining the ratio of training to testing cases 
that is based on the work of Schaafsma and van 
Vark (1979). This heuristic suggests a ratio of
$[1 + (p - 1)^{1/2}]^{-1}$, where $p$ is the number of predictors. For
example, if $p = 10$, the testing set should be $1/[1 + \sqrt{9}]$,
or 25 percent of the complete data set. An increase in 
the number of predictors is matched by an increase in 
the proportion of cases needed for training. For exam- 
ple, if there are seventeen predictors, 80 percent of 
cases are needed for training. In reality, this type of 
partitioning is a special case of a broader class of com- 
putationally more intensive approaches. The first of 
these, k-fold partitioning, splits the data into k equal 
sized partitions. Each is used sequentially as a test set, 
whilst the remaining k - 1 sets are used for training.
This yields k performance measures. The overall performance
is then based on an average over these k test
sets. 
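
The partitioning arithmetic above can be sketched as follows. This is an illustrative sketch only: the function names are assumptions, the testing fraction is taken to be 1/[1 + sqrt(p - 1)] as in the worked examples, and the folds produced here differ in size by at most one case when the number of cases is not a multiple of k.

```python
import math
import random


def huberty_test_fraction(n_predictors):
    """Proportion of cases to hold out for testing, given p predictors."""
    return 1.0 / (1.0 + math.sqrt(n_predictors - 1))


def k_fold_indices(n_cases, k, seed=0):
    """Yield (train, test) index lists; each case is tested exactly once."""
    idx = list(range(n_cases))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled cases into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


print(huberty_test_fraction(10))   # 10 predictors: hold out a quarter
print(huberty_test_fraction(17))   # 17 predictors: hold out a fifth

splits = list(k_fold_indices(n_cases=23, k=5))
for train, test in splits:
    print(len(train), len(test))
```

Averaging a performance measure over the k test folds then gives the overall estimate described in the text.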

Leave-One-Out (L-O-O), and the related jackknife 
procedures, give the best compromise between maxi- 
mizing the size of the training data set and providing a 
robust test of the classifier. In both of these methods 
each case is used sequentially as a single test sample, 
while the remaining n - 1 cases form the training set.
Alternatively, a large number of bootstrapped samples 
(random sampling with replacement) may be used for 
testing. Efron (1983) and Efron and Tibshirani (1997) 
developed their 0.632 estimator to correct the slight 
bias in error estimates obtained from bootstrapped 
samples. 
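
A small sketch may make these resampling schemes concrete. Everything here is an assumption made for illustration: the classifier is a toy nearest-class-mean rule on a single hypothetical predictor, and the bootstrap function implements the plain 0.632 weighting (0.368 times the resubstitution error plus 0.632 times the error on cases left out of each bootstrap sample), not any later refinement.

```python
import random


def fit_class_means(xs, ys):
    """'Train' the toy classifier: store the mean predictor value per class."""
    means = {}
    for c in set(ys):
        vals = [x for x, y in zip(xs, ys) if y == c]
        means[c] = sum(vals) / len(vals)
    return means


def predict(means, x):
    """Assign a case to the class with the nearest mean."""
    return min(means, key=lambda c: abs(x - means[c]))


def error_rate(means, xs, ys):
    return sum(predict(means, x) != y for x, y in zip(xs, ys)) / len(xs)


def leave_one_out_error(xs, ys):
    """Each case is the test sample once; the other n - 1 cases train."""
    errors = 0
    for i in range(len(xs)):
        means = fit_class_means(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors += predict(means, xs[i]) != ys[i]
    return errors / len(xs)


def bootstrap_632_error(xs, ys, n_boot=200, seed=0):
    """Plain 0.632 estimator: blend resubstitution and out-of-sample error."""
    rng = random.Random(seed)
    n = len(xs)
    resub = error_rate(fit_class_means(xs, ys), xs, ys)
    oob_errors = []
    for _ in range(n_boot):
        drawn = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        drawn_set = set(drawn)
        left_out = [i for i in range(n) if i not in drawn_set]
        boot_y = [ys[i] for i in drawn]
        if not left_out or len(set(boot_y)) < 2:
            continue  # skip degenerate resamples
        means = fit_class_means([xs[i] for i in drawn], boot_y)
        oob_errors.append(
            error_rate(means, [xs[i] for i in left_out], [ys[i] for i in left_out])
        )
    oob = sum(oob_errors) / len(oob_errors)
    return 0.368 * resub + 0.632 * oob


# Hypothetical habitat scores for absences (0) and presences (1).
xs = [1.0, 1.5, 2.0, 2.5, 3.5, 3.0, 4.0, 4.5, 5.0, 5.5]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

loo = leave_one_out_error(xs, ys)
b632 = bootstrap_632_error(xs, ys)
print(loo, round(b632, 3))
```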


Incorporate Costs 


The index must take into account the total cost of the 
prediction errors. Three categories of cost apply to all 
classifiers. First, there are the predictor costs, which 
may be complex if some costs are shared between pre- 
dictors (Turney 1995). For example, most field data 
will share the overheads associated with transportation
to the field. These costs should be considered during
the classifier's development (e.g., Schiffers 1999;
Turney 1995). Second, there are computational costs,
which include the time spent preprocessing the data 
and learning how to use the classifier and the com- 
puter resources that are needed to run the classifier. 
TABLE 21.2.


Three examples of confusion matrices^a and six derived
accuracy measures.


                    A                B                C
                +       -        +       -        +       -
+              95      20       80       5       95     120
-               5      80       20      95        5     780

CCR^b             0.875            0.875            0.875
Sensitivity       0.950            0.800            0.950
Specificity       0.800            0.950            0.867
PPP^c             0.826            0.941            0.442
Kappa             0.750            0.750            0.540
NMI               0.480            0.480            0.453


^a Columns are actual classes, rows are predicted.

^b CCR = Correct Classification Rate.

^c PPP = Positive Predictive Power (true positives/[true positives + false
positives]).


Finally, there are the costs associated with the misclas- 
sified cases. This section is concerned only with the 
latter. 

In a simple presence/absence classifier, we can make 
two mistakes (errors of commission and omission) 
that may have different costs. For example, do the 
measures shown for the three confusion matrices in 
Table 21.2 reflect the ecological “value” of the level of 
predictive accuracy? The answer must be no, because 
they do not identify all of the important differences 
between the three matrices. In particular, they do not 
take account of the misclassification costs. 
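
To show how such measures are derived, the sketch below recomputes five of the six values in Table 21.2 from the cell counts. The function name and the dictionary layout are assumptions; Kappa is taken to be Cohen's kappa, and NMI is omitted because the text here does not define it.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Derive common accuracy measures from a 2 x 2 confusion matrix.

    tp: predicted present, actually present   fp: predicted present, actually absent
    fn: predicted absent, actually present    tn: predicted absent, actually absent
    """
    n = tp + fp + fn + tn
    ccr = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "CCR": ccr,
        "Sensitivity": tp / (tp + fn),
        "Specificity": tn / (tn + fp),
        "PPP": tp / (tp + fp),
        "Kappa": (ccr - expected) / (1 - expected),
    }


# Cell counts (tp, fp, fn, tn) read from Table 21.2.
matrices = {"A": (95, 20, 5, 80), "B": (80, 5, 20, 95), "C": (95, 120, 5, 780)}

results = {name: accuracy_measures(*cells) for name, cells in matrices.items()}
for name, measures in results.items():
    print(name, {k: round(v, 3) for k, v in measures.items()})
```

All three matrices share a CCR of 0.875, yet they differ in which error type dominates, which is the point the text makes about unweighted measures.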

Measures derived from confusion matrices assume 
that both error types are equivalent. There are situa- 
tions, for example in a conservation-based model, 
where this assumption can be questioned. If a model is 
used to define protected areas, failure to correctly pre- 
dict positive locations will be more “costly” (in con- 
servation terms) than would commission errors, or in 
other words, omission cost (OC) is greater than com- 
mission cost (CC). Although these inequalities can be 
compensated for partly by the choice of error measure 
and allocation threshold (Fielding and Bell 1997), it is 
possible to adopt other approaches, such as a cost ma- 
trix that weights errors prior to the calculation of 
model accuracy. For example, we could assign weights 
by taking into account perceived threats to the species. 
Although it can be argued that the allocation of costs 


must be subjective, unless there are clear economic 
gains and losses, a failure to explicitly apply costs 
equates to the implicit application of equal costs that 
can rarely be justified. 

The cost associated with an error depends upon the 
relationship between the actual and predicted classes. 
This is because misclassification costs may not be re- 
ciprocal; for example, classifying a nest site as a non- 
nest site may be more costly than the reverse. Lynn et 
al. (1995) used a matrix of misclassification costs to 
evaluate the performance of a decision-tree model for 
the prediction of landscape levels of potential forest 
vegetation. Their cost structure was based on the 
amount of compositional similarity between pairs of 
groups. 

If costs are applied, the aim changes from one of 
minimizing the number of errors to one that mini- 
mizes the cost of the errors. Consequently, the imposi- 
tion of costs complicates how the classifier's perform- 
ance should be assessed. For example, we now need to 
take account of the total costs that could be incurred 
from the imposition of a trivial rule that assigns cases 
to the least costly class. 
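
A minimal sketch of cost weighting, under stated assumptions: the omission cost (OC = 5) and commission cost (CC = 1) are hypothetical weights, the function names are illustrative, and the counts are those of matrices A and B in Table 21.2. The trivial rules assign every case to a single class.

```python
def total_cost(fp, fn, omission_cost, commission_cost):
    """Cost of a classifier's errors: weighted omissions plus commissions."""
    return fn * omission_cost + fp * commission_cost


def trivial_rule_costs(n_present, n_absent, omission_cost, commission_cost):
    """Costs incurred by assigning every case to one class."""
    return {
        "all present": n_absent * commission_cost,  # every absence is a commission
        "all absent": n_present * omission_cost,    # every presence is an omission
    }


OC, CC = 5, 1  # hypothetical: an omission is five times as costly

cost_a = total_cost(fp=20, fn=5, omission_cost=OC, commission_cost=CC)
cost_b = total_cost(fp=5, fn=20, omission_cost=OC, commission_cost=CC)
baseline = trivial_rule_costs(100, 100, OC, CC)

print(cost_a, cost_b, baseline)
```

Under these weights, matrix B costs slightly more than simply predicting presence everywhere, even though its correct classification rate equals that of matrix A.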


Prevalence Independent 


The value of the index should be independent of the 
proportion of cases in each group (prevalence). This 
is difficult because prevalence can affect the classifier 
in a number of ways. First, there are ecological im- 
plications of low prevalence, possibly arising from 
the important differences between rare and common 
species, some of which will influence the model's per- 
formance (Boone and Krohn 1999). Karl et al. 
(Chapter 51) showed, using sampling simulations, 
how commission errors decreased with increasing 
sample size, while omission error rates remained 
constant but with increased precision. Second, there 
are classifier-specific problems, such as those caused 
by the use of prior probabilities when assigning cases 
to groups (Titus et al. 1984; Manel et al. 1999). Fi- 
nally, some confusion-matrix-derived measures are 
sensitive to the prevalence (p) of positive cases. For 
example, even the simple correct classification rate 
(CCR) is affected by the prevalence since
$\mathrm{CCR} = p \times \mathrm{sensitivity} + (1 - p) \times \mathrm{specificity}$ (Ruttiman
1994), where sensitivity is the ratio of correctly predicted
positives to the total number of positive cases,
and specificity is the ratio of correctly predicted neg- 
ative cases to the total number of negative cases. 
Consequently, it is important to avoid these potential 
pitfalls when a model's performance measures are in- 
terpreted in an ecological context (Fielding and Bell 
1997). 
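
The dependence of the CCR on prevalence can be illustrated with a minimal sketch. The function name and the sensitivity and specificity values below are hypothetical and chosen only for illustration; the arithmetic is the prevalence-weighted sum of sensitivity and specificity.

```python
def correct_classification_rate(prevalence, sensitivity, specificity):
    """CCR as a prevalence-weighted mix of sensitivity and specificity."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


# Hypothetical classifier whose sensitivity and specificity stay fixed
# while the proportion of positive cases changes.
for p in (0.05, 0.25, 0.50):
    ccr = correct_classification_rate(p, sensitivity=0.70, specificity=0.95)
    print(f"prevalence = {p:.2f}  CCR = {ccr:.4f}")
```

With these assumed values the CCR falls from 0.9375 to 0.825 as prevalence rises from 0.05 to 0.50, although the classifier itself has not changed.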


Threshold Independent 


Although there are better measures than the simplis- 
tic overall percentage correct (Fielding and Bell 
1997), most fail to make use of all of the available 
information. Most classifiers assign cases to groups 
following the application of a threshold (cut-point) 
to some score, for example a discriminant score. In- 
evitably, this dichotomization of the raw score re- 
sults in the loss of information; in particular, we do 
not know how marginal the assignments were. For 
example, using a 0-1 raw score scale and a 0.5 
threshold, cases with scores of 0.499 and 0.501 
would be assigned to different groups. In addition, 
unequal class sizes can influence the allocations, with 
cases being more likely to be assigned to the larger 
class. There are several strategies that we can use to 
deal with this bias. In particular, we can adjust the 
prior probabilities of class membership or adjust the 
assignment threshold. Similarly, if we have decided 
that omission errors are more serious than commis- 
sion errors, the threshold can be adjusted to decrease 
the omission rate at the expense of an increased com- 
mission error rate. Few ecological studies appear to 
have addressed this problem (some exceptions are 
Capen et al. 1986; Fielding and Haworth 1995; 
Manel et al. 1999; Pereira and Itami 1991). Other 
thresholds could be justified that are dependent on 
the intended application of the classifier, for example 
a “minimum acceptable error” or omission criterion. 
However, adjusting the threshold does not have con- 
sistent effects on confusion-matrix-derived measures. 
For example, lowering the threshold increases sensi- 
tivity but decreases specificity. Consequently, any ad- 
justments to the threshold must be made within the 
context that the classifier is used. 
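
The trade-off described above can be sketched with a small, made-up set of raw scores. The scores, labels, and function names are assumptions for illustration only; cases scoring at or above the threshold are treated as predicted positives.

```python
def confusion_counts(scores, labels, threshold):
    """Count tp, fn, fp, tn when cases scoring >= threshold are called positive."""
    tp = fn = fp = tn = 0
    for score, is_positive in zip(scores, labels):
        predicted_positive = score >= threshold
        if is_positive and predicted_positive:
            tp += 1
        elif is_positive:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity at one threshold."""
    tp, fn, fp, tn = confusion_counts(scores, labels, threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up raw scores on a 0-1 scale; True marks a real positive case.
scores = [0.10, 0.20, 0.35, 0.45, 0.499, 0.501, 0.60, 0.70, 0.80, 0.90]
labels = [False, False, False, True, False, True, False, True, True, True]

for threshold in (0.3, 0.5, 0.7):
    sens, spec = sensitivity_specificity(scores, labels, threshold)
    print(f"threshold = {threshold:.1f}  sensitivity = {sens:.2f}  specificity = {spec:.2f}")
```

In this toy data set, lowering the threshold from 0.7 to 0.3 raises sensitivity from 0.6 to 1.0 while specificity drops from 1.0 to 0.4.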

An alternative to threshold adjustments is to make use of
all the information contained within the original raw
score and calculate measures that are threshold inde- 
pendent. The receiver operating characteristic (ROC) 
plot is a threshold-independent measure that was de- 
veloped as a signal-processing technique. The term 
refers to the performance (the operating characteristic) 
of an “observer” (receiver) that assigns cases into
dichotomous classes (Deleo 1993; Deleo and Campbell
1990). The technique has been applied widely to clini- 
cal problems (Zweig and Campbell 1993) and there 
has been some recent interest from the ML community 
(Bradley 1997; Provost and Fawcett 1997). Marsden 
and Fielding (1999) used ROC plots when assessing 
the effectiveness of cross-island predictions of the 
habitat use by parrots on three Wallacean islands, and 
Manel et al. (1999) used them to overcome the thresh- 
old sensitivity of logistic regression when comparing 
species' distribution models. 

A ROC plot is obtained by plotting all sensitivity 
values (true positive proportion) on the y-axis against 
their equivalent (1 — specificity) values (commission 
proportion) on the x-axis (Fig. 21.2). The area under 
the ROC function (AUC) is usually taken as the index 
of performance because it provides a single measure of 
overall accuracy that is independent of any particular 
threshold (Deleo 1993). It is also invariant to prior 
class probabilities (Bradley 1997). The AUC has a 
value between 0.0 and 1.0. If the value is 0.5, the 
scores for two groups do not differ and the classifier 
would perform as well as a coin toss. Conversely, an 
AUC of 1.0 indicates no overlap in the distributions of 
the group scores and the classifier would never mis- 
classify. An AUC of 0.75 indicates that, on 75 percent 
of occasions, a random selection from the positive 
group will have a score greater than a random selec- 
tion from the negative class (Deleo 1993). 
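
The following sketch, using made-up scores and hypothetical function names, builds the points of a ROC plot and computes the AUC in two ways: as the area under the stepped plot and as the proportion of positive-negative pairs in which the positive case has the higher score. It is an illustration of the interpretation given above, not the procedure used by any particular package.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per candidate threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Lower the threshold step by step through the distinct scores, highest first.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC plot by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def auc_pairwise(scores, labels):
    """Proportion of positive-negative pairs in which the positive case scores
    higher (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up raw scores; True marks a real positive case.
scores = [0.10, 0.20, 0.35, 0.45, 0.499, 0.501, 0.60, 0.70, 0.80, 0.90]
labels = [False, False, False, True, False, True, False, True, True, True]

print(auc_trapezoid(roc_points(scores, labels)))
print(auc_pairwise(scores, labels))
```

For these scores both calculations give 0.88: in 22 of the 25 positive-negative pairs the positive case has the higher score.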

Figure 21.2. An ROC plot for training (solid line, 75 percent of
the data) and testing data (dashed line, remaining 25 percent).
Four thresholds are marked: (a) 1.13; (b) 0.79; (c) 0.13; (d)
−0.16. The data are for the golden eagle (Aquila chrysaetos)
nest sites described in the example.

Despite its advantages, the ROC plot does not provide
an automatic rule for class allocation. However,
there are strategies that can be used to develop decision
rules from the ROC plot (Deleo 1993; Zweig and
Campbell 1993). Finding the appropriate allocation
threshold from a ROC plot depends on having values
for the relative costs of commission and omission errors.
Assigning values to these costs is complex and
subjective, and dependent upon the context within
which the classification rule will be used. As a guideline,
Zweig and Campbell (1993) suggest that if the cost
of a commission error (CC) is greater than the cost of
an omission error (OC), then the threshold should favor
specificity, while sensitivity should be favored if OC is
greater than CC. Combining these costs with the
prevalence (p) of positive cases enables a slope (m) to
be calculated (Zweig and Campbell 1993) from m =
(CC/OC) × ((1 − p)/p), where m describes the slope of
a tangent to the ROC plot if it is a smooth and parametric
curve. The point at which this tangent touches
the curve identifies the particular sensitivity/specificity
pair that should be used to select the threshold. If
costs are ignored (or assumed to be equal), the slope of
the tangent is simply the ratio of negative to positive cases.
Zweig and Campbell (1993) also describe a related algorithm
for stepped nonparametric ROC plots.
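
A minimal sketch of how the slope might be applied to a stepped plot follows. It is not the algorithm of Zweig and Campbell (1993); it simply assumes that the preferred point is the one lying on the highest line of slope m, that is, the point maximizing sensitivity − m × (1 − specificity). The scores and function names are hypothetical.

```python
def tangent_slope(commission_cost, omission_cost, prevalence):
    """m = (CC / OC) * ((1 - p) / p)."""
    return (commission_cost / omission_cost) * ((1.0 - prevalence) / prevalence)


def best_threshold(scores, labels, slope):
    """Pick the threshold whose ROC point lies on the highest line of the given
    slope, i.e. the one maximizing sensitivity - slope * (1 - specificity)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
        intercept = tp / n_pos - slope * fp / n_neg
        if best is None or intercept > best[0]:
            best = (intercept, threshold)
    return best[1]


# Made-up raw scores; True marks a real positive case (prevalence 0.5).
scores = [0.10, 0.20, 0.35, 0.45, 0.499, 0.501, 0.60, 0.70, 0.80, 0.90]
labels = [False, False, False, True, False, True, False, True, True, True]

# Omission errors four times as costly as commission errors, then the reverse.
print(best_threshold(scores, labels, tangent_slope(1.0, 4.0, 0.5)))
print(best_threshold(scores, labels, tangent_slope(4.0, 1.0, 0.5)))
```

When omission errors are the more costly, the selected threshold is low (0.45 here, favoring sensitivity); when commission errors are the more costly, it is high (0.7, favoring specificity).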


Better Than Guessing 


Is the level of accuracy better than could be obtained 
by chance or by the application of some trivial rule 
(e.g. put all cases in one group)? If one class has a high 
prevalence, a high CCR is achieved by the simple ex- 
pedient of assigning all cases to the most common 
class. For example, if the prevalence of positive cases 
was 0.01, a CCR of 0.99 is possible if all cases are la- 
beled as negatives. Huberty (1994:105) describes a 
one-tailed test that can be used to test if a classifier’s 
performance is better than could be obtained by
chance. If p is low, the calculation is adjusted to
enable a test of improvement over the rate achieved
using the trivial rule of assigning all cases to the nega- 
tive group. 

Kappa (K), the proportion of specific agreement, is 
often used to assess improvement over chance. Landis 
and Koch (1977) suggested that K < 0.4 indicates poor 
agreement, whilst a value above 0.4 is indicative of 
good agreement. However, K is sensitive to the sample 
size and it is unreliable if one class dominates. The tau 
coefficient (Ma and Redmond 1995) is a related measure
that depends on a priori knowledge of the prevalence
rather than the a posteriori estimate used by K. The
more recent Normalized Mutual Information (NMI)
measure does not suffer from these problems, but it
can show non-monotonic behavior (Forbes 1995). It is
important to note that none of these three measures
favors accurate prediction of positive cases (see
Table 21.2).
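
As an illustration, the sketch below assumes that K is Cohen's kappa computed from a 2 × 2 confusion matrix, with chance agreement estimated from the row and column totals. The counts are hypothetical and the function name is not taken from any package.

```python
def kappa(tp, fn, fp, tn):
    """Cohen's kappa for a 2 x 2 confusion matrix."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical sample of 1,000 cases with a prevalence of 0.01.
# Trivial rule: label every case negative (CCR = 0.99).
print(kappa(tp=0, fn=10, fp=0, tn=990))
# A rule that finds 8 of the 10 positives at the cost of 40 commission errors.
print(kappa(tp=8, fn=2, fp=40, tn=950))
```

The trivial rule achieves a CCR of 0.99 but a kappa of zero, whereas the second rule has a lower CCR (0.958) and a kappa of about 0.26.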


Incorporate the Context of the Predictions 


Most error assessments do not take into account the 
context of any errors. For example, Riordan (1998) 
noted that if nineteen out of twenty footprints in one 
track run were assigned to animal A, and one was as- 
signed to animal B, this was an obvious misclassifica- 
tion that could be ignored, since all the prints in one 
track set must be from the same individual. Similarly, 
it may be useful to examine the spatial pattern of 
prediction errors. Buckland and Elston (1993) dis- 
cussed how patterns in prediction errors could be 
used to infer spatial patterns of habitat suitability, 
and Fielding and Bell (1997) described two spatially 
corrected error rates that are related to the technique 
developed by Augustin et al. (1996) for incorporat- 
ing explicit autocorrelation into general linear pres- 
ence/absence models. The rationale for the spatial 
weighting is that commission errors adjacent to real 
positives may be less serious errors than commission 
errors that are more distant from a real positive. 
Similarly, it is possible that some prediction errors 
are a consequence of ecological processes such as 
interference that have not been incorporated in 
the classifier, for example, the peregrine falcon-golden
eagle example described earlier. In such
circumstances, it may be possible to incorporate
ecological information about territory size and spacing
to weight some of the prediction errors. 


An Example 


The following example is used to illustrate most of 
the issues raised in the previous sections. The data 
are a subset of those used by Fielding and Haworth 
(1995). The aim is to construct a classifier that is ca- 
pable of predicting the location of golden eagle nest 
sites (current, alternative, and historical [since 
1960], n = 60) on the Island of Mull, Scotland, UK. 
The potential predictors are the habitat within each 
of 1,117 one-square-kilometer squares. These con- 
sist of the scores from eight principal components 
that retained 78 percent of the variance from the 
original twenty-one predictors. Because it is a rea- 
sonably well-known technique, discriminant analy- 
sis (SPSS for Windows, Release 9.0) was used as the 
classifier. 

The results from a range of analyses are summarized
in Table 21.3. Using these data, there is very little
difference between the results obtained using
resubstitution (equal priors) and those using
cross-validation. Unfortunately, we have to be cautious
about both analyses because the large number of
commission errors has resulted in low Positive Predic- 
tive Power (PPP) values. There is only a 12 percent 
chance that a predicted square actually contains a real 
nest. The Kappa statistic is quite low, again suggesting 
that the classifier is not performing well. If the priors 
are changed to reflect the class sizes—60/1,117 and 
1,057/1,117—the CCR rises to an impressive 94.4 
percent. Unfortunately, we are no longer able to cor- 
rectly predict any real nest sites. 
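
The measures quoted in this paragraph can be recomputed from the counts in Table 21.3. The sketch below uses the counts for the cross-validated analysis and for the analysis with class-size priors; the function name and dictionary keys are hypothetical.

```python
def summarize(n, tp, fn, fp):
    """Confusion-matrix measures from the counts reported for one analysis."""
    tn = n - tp - fn - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "PPP": tp / (tp + fp) if tp + fp else float("nan"),
        "CCR": (tp + tn) / n,
    }


# Counts for the cross-validated analysis (Table 21.3).
cross_validated = summarize(n=1117, tp=42, fn=18, fp=320)
# Counts when the priors reflect class sizes: no nest squares are found.
class_size_priors = summarize(n=1117, tp=0, fn=60, fp=3)

for name, result in (("cross-validated", cross_validated),
                     ("class-size priors", class_size_priors)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

The first analysis gives a PPP of about 0.12 with a CCR of about 0.70; the second gives a CCR of about 0.94 with a sensitivity of zero.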

In the remainder of the analyses, three partitioning 
schemes are used. The first applies Schaafsma and 
van Vark's (1979) rule, which uses the number of pre- 
dictors to determine the proportion of cases needed 
for testing. Seventy-five percent of the cases were se- 
lected randomly for training. As might be expected, 
the classifier performed better with the training data. 
The second scheme was a fivefold partitioning. No at- 
tempt was made to retain an equal number of nests 
within each partition. Although the mean values are
quite similar to the previous results, it is interesting to
note the range of values obtained for these five sets. 
When the second partition was used for testing, the 
predictions were better than those obtained with the 
training data. When the fifth partition was used for 
testing, there was a large discrepancy between the 
training and testing sensitivity values. These results il- 
lustrate clearly that the performance of a classifier is 
dependent on the composition of the training and 
testing sets. 

The final partitioning scheme attempts to simulate 
prospective sampling. The island can be split into 
three geographical regions. The data from each re- 
gion were used to produce classification rules that 
were tested on the other two regions. These results il- 
lustrate some important points about accuracy as- 
sessment. For example, if the CCR is used as an 
index, it is apparent that the North Mull classifier 
does not translate well to other regions. However, if 
sensitivity is used, North Mull is an excellent classi- 
fier. This discrepancy arises because almost half of 
the cases are errors of commission. Although it is not 
possible to illustrate this here (to protect the nest site 
locations), the commission errors are clustered 
around the real nest sites. If a spatial correction is 
applied (Fielding and Bell 1997), the CCR rises to 67 
percent. When the Ross of Mull data were used for
training, a high CCR, combined with good sensitivity,
was obtained. However, the classifier performed
poorly with the eagle-nest-site test data.
Fielding and Haworth (1995) demonstrated that the 
cross-region predictive success for a range of raptors 
on the island of Mull and on the adjacent Argyllshire 
mainland was dependent upon the particular combi- 
nation of species, training, and testing sets. There 
was no guarantee of reciprocity. Similar patterns of 
between-region predictive failure were also found for 
parrots on Wallacean islands (Marsden and Fielding 
1999). 

TABLE 21.3.

Results obtained from various permutations of training and testing data sets.

Data set                    n     tp   fn   sens    fp    PPP     K      NMI    CCR%
Resubstitution EP*        1,117   44   16   0.73    317   0.12   0.13   0.10    70.2
Cross validated**         1,117   42   18   0.70    320   0.12   0.12   0.08    69.7
Resubstitution CP***      1,117    0   60   0.00      3   0.00  -0.01           94.4
75% training                855   33   11   0.75    244   0.12   0.13   0.10    70.2
25% testing                 262   11    5   0.69     81   0.12   0.11   0.07    67.2
k-fold results, k = 5
train k = 2, 3, 4, 5        874   34   13   0.72    231   0.13   0.14   0.10    72.1
test k = 1                  243    9    4   0.69     79   0.10   0.09   0.06    65.8
train k = 1, 3, 4, 5        895   33   16   0.67    246   0.12   0.12   0.08    70.7
test k = 2                  222    9    2   0.82     51   0.15   0.19   0.17    76.1
train k = 1, 2, 4, 5        897   33   13   0.72    240   0.12   0.13   0.10    71.8
test k = 3                  220    8    6   0.57     45   0.15   0.15   0.07    76.8
train k = 1, 2, 3, 5        892   36   13   0.73    255   0.12   0.13   0.10    70.0
test k = 4                  225    8    3   0.73     73   0.10   0.10   0.07    66.2
train k = 1, 2, 3, 4        910   38   11   0.78    248   0.13   0.15   0.12    71.5
test k = 5                  207    6    5   0.54     63   0.09   0.06   0.03    67.1
Training data mean                          0.72    244   0.12   0.13   0.10    71.2
Training data std error                     0.017   4.04  0.003  0.005  0.008   0.384
Testing data mean                           0.67     62   0.12   0.12   0.08    70.4
Testing data std error                      0.050   6.410 0.014  0.022  0.025   2.480
Cross-region predictions
Ben More                    583   26   13   0.67    162   0.14   0.13   0.07    70.0
Test set                    534   14    7   0.67     96   0.13   0.16   0.12    80.7
Ross of Mull                175    4    1   0.80     27   0.13   0.18   0.21    84.0
Test set                    942   28   27   0.51    178   0.14   0.13   0.06    78.2
North Mull                  359   11    5   0.69     68   0.14   0.17   0.13    79.7
Test set                    758   41    3   0.93    312   0.12   0.12   0.14    58.4

Note: Approximately 75 percent of the data were selected randomly to form the training set, and the remainder formed the test set. These
proportions were determined using the rule suggested by Schaafsma and van Vark (1979).

n = number of cases; tp = true positive; fn = false negative; fp = false positive;
sens = sensitivity (true positive fraction); PPP = Positive Predictive Power;
K = Kappa; CCR = Correct Classification Rate.
*EP = equal prior probabilities
**Cross validated = Leave-One-Out testing
***CP = class size prior probabilities


TABLE 21.4.

Accuracy assessment after applying different thresholds to a discriminant score.

                   Training (AUC = 0.798)             Testing (AUC = 0.762)
Threshold     tp   fn    fp     K      PPP       tp   fn    fp     K      PPP
  -0.92       44    0   662   0.023   0.06       15    1   200   0.018   0.07
  -0.46       43    1   516   0.052   0.08       15    1   171   0.041   0.08
   0.00       40    4   353   0.100   0.10       13    3   120   0.073   0.10
   0.46       33   11   237   0.134   0.12       11    5    80   0.114   0.12
   0.92       38    6   135   0.141   0.22        9    7    41   0.199   0.18
   1.38       26   16    49     ?     0.35        6   10    20   0.227   0.23
   1.84       14   30    28     ?     0.33        4   12    10   0.222   0.29

Note: See Table 21.3 for sample sizes.

tp = true positive, fn = false negative, fp = false positive, K = Kappa (figures in italics have p > 0.05).
The default threshold is 0.46.


In a final series of tests, the effect of the allocation
threshold was investigated. These tests used the classifier
trained on 75 percent of the data. The results are
summarized in Table 21.4. As expected, changing the
threshold altered the classification accuracy. Note that
the value of the Kappa statistic rises as the number of
commission errors declines. Unfortunately, this is at
the expense of true positives and is a consequence of
the low prevalence of positive locations. Selecting the
appropriate threshold for these data is dependent on
the intended application. For example, if I wished to
be reasonably certain of including 75 percent of nest
sites in future samples, the default threshold of 0.46 or
below should be applied. Conversely, if I had limited
resources and wished to maximize my chances of finding
a nest site in an unsurveyed area (large PPP), I
would need to apply a threshold of 1.84 or above. If
relative costs are assigned to the incorrect predictions,
an optimum threshold can be determined from a ROC
plot. As the relative cost of the omission errors increases,
the slope of the tangent reduces and the
threshold declines. For example, using the testing
data, a relative omission cost of two suggests a threshold
of 1.91, while a relative cost of ten decreases the
threshold to 0.79.
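
The direction of this effect can be checked with a short calculation. The sketch assumes the slope formula m = (CC/OC) × ((1 − p)/p) given earlier, with the omission cost expressed relative to a commission cost of one, and takes the testing set to contain 16 nest squares and 246 other squares (262 cases in total). The function name is hypothetical, and the thresholds themselves cannot be recovered without the original scores.

```python
def tangent_slope(relative_omission_cost, n_positive, n_negative):
    """Slope m = (CC / OC) * ((1 - p) / p), with OC relative to CC = 1.

    (1 - p) / p is simply the ratio of negative to positive cases.
    """
    return (1.0 / relative_omission_cost) * (n_negative / n_positive)


# Testing set: 16 squares with nests and 246 without.
for cost in (1, 2, 10):
    slope = tangent_slope(cost, 16, 246)
    print(f"relative omission cost = {cost:>2}  slope = {slope:.2f}")
```

The slope falls from about 15.4 with equal costs to about 7.7 and 1.5 for relative omission costs of two and ten, so the tangent touches the ROC plot progressively further toward the high-sensitivity end.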


Summary 


Most ecologists do not undertake the development of 
predictive models lightly. Considerable effort is put 
into ensuring that the ecology and the analyses are 
sensible and appropriate. Unfortunately, this effort is 
not usually carried through to an appropriate assess- 
ment of the accuracy of the predictions. In this chap- 
ter, I have attempted to raise the profile of this aspect 
of distribution modeling. Although it is difficult to 
be prescriptive, the best method would appear to be 
the use of the AUC from a ROC plot, because it is 
threshold independent and provides an indication of 
how well the two classes are separated (Bradley 
1997). I would also emphasize the need for rigorous 
testing using independent data. Whenever possible, 
this should involve some truly independent data ob- 
tained through prospective sampling. A more de- 
tailed discussion of the construction and assessment 
of classification rules can be found in Hand (1997).


A Minimalist Approach
to Mapping Species’ Habitat: 
Pearson’s Planes of Closest Fit 


John T. Rotenberry, Steven T. Knick, and James E. Dunn 


Identification of probable use areas by animals is
important for land-use planning, identification of
conservation regions, and ecological studies of spatial
distribution, movements, and resource use. Many
statistical methods, often extensions of nonspatial re- 
source selection models (e.g., Manly et al. 1993), in- 
creasingly are employed in geographical information 
systems (GIS) to determine the value of an index of 
use or likelihood of occurrence of a species at each 
point or grid cell within a study area based on the 
multivariate configuration of habitat variables at those 
points. The resulting maps then depict spatial varia- 
tion in potential animal use, often at relatively fine res- 
olution over large areas. 

An issue that frequently arises in spatial mapping 
exercises is that whatever model is developed must be 
applied to points and even landscapes not originally 
sampled for the target organism. Indeed, the extrapo- 
lation of such GIS-based models forms the core of 
their use in identification of areas of conservation in- 
terest. However, spatial mapping extends beyond sim- 
ply developing and calibrating selection models for 
prediction; statistical models that have a large R² may
have little predictive capability (Rotenberry 1986; 
Gauch 1993; ter Braak and Looman 1995). This 
problem may be especially pronounced when attempting
to predict animal use regions outside the immediate
study area or in areas undergoing change, which
may contain landscape configurations not present
when the original selection model was developed. The solution
to this problem is significant not only for the capabil- 
ity to reliably predict animal use areas without addi- 
tional sampling but also because of the ecological im- 
plication that such a model can identify the basic 
requirements of animals that may be present in any al- 
ternative environment. 

In this chapter, we discuss one emerging model that 
has been used in just such a GIS-mapping context, 
Mahalanobis D² (Clark et al. 1993b), briefly describing
its biological inferences and how they contribute
to its successes and failures (Knick and Rotenberry
1998). We then describe the biological attributes that
a more appropriate model should embody. Finally, we
provide an alternate statistical model based on the
decomposition of D² and show how it incorporates
those attributes.


Mahalanobis D²


The D² technique for mapping the probability of animal
use or occupancy of a point is based on the generalized
squared distance, D²(y), as a measure of the dissimilarity
between a p-dimensional vector y and a
sample mean vector μ of the same dimension, in a
space standardized by Σ, the variance-covariance matrix
of the sample (Clark et al. 1993b). Formally:

H = occupied habitat, an n × p matrix of p variables
measured at n points where a species was detected.
Frequently, this may be abstracted from a much-larger set
of points that were surveyed but some of which did not
include the target species.

Eqn. 22.1: Mahalanobis distance D²:

$$D^2(y) = (y - \mu)' \, \Sigma^{-1} \, (y - \mu)$$

where μ = vector of means based on H (p × 1) (i.e.,
    the centroid),
y = vector of measurements on any point (p × 1;
    may or may not be taken from H); thus y − μ
    is a vector of deviations of a point from a
    species’ mean vector,
Σ = variance-covariance matrix based on H
    (p × p), and
D² is a squared scalar distance, standardized
    in the Σ metric.


Thus, any point may be described by its distance from 
the centroid of occupied habitat. Presumably, the 
closer it lies to this centroid, the more it resembles oc- 
cupied (or occupiable, if we are predicting) habitat. As 
a hypothetical example, we might measure the size of 
the contiguous patch of shrubland in which each point 
occupied by a species lies. This will generate a mean 
occupied patch size (the univariate equivalent to μ)
and its variance (the univariate equivalent to Σ). Any
individual point (occupied or not, from the same or a
new sample) will also have a value for patch size
(equivalent to y). The difference (“distance”) between
the point and the mean (scaled by the variance) is a 
measure of the point's similarity to occupied habitat, 
and, presumably, more similar is better. 
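
For readers who prefer to see the calculation, the following is a minimal sketch of Eqn. 22.1 for two habitat variables, written without external libraries. The data, variable names, and function names are hypothetical; the inverse of Σ is written out explicitly for the 2 × 2 case only.

```python
def mean_vector(rows):
    """Column means of the occupied-habitat matrix H."""
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(len(rows[0]))]


def covariance_2x2(rows):
    """Sample variance-covariance matrix for two variables."""
    m0, m1 = mean_vector(rows)
    n = len(rows)
    s00 = sum((a - m0) ** 2 for a, _ in rows) / (n - 1)
    s11 = sum((b - m1) ** 2 for _, b in rows) / (n - 1)
    s01 = sum((a - m0) * (b - m1) for a, b in rows) / (n - 1)
    return [[s00, s01], [s01, s11]]


def mahalanobis_d2(y, mu, sigma):
    """D2(y) = (y - mu)' Sigma^-1 (y - mu) for the two-variable case."""
    (a, b), (_, d) = sigma
    det = a * d - b * b
    dx, dy = y[0] - mu[0], y[1] - mu[1]
    # Explicit inverse of a symmetric 2 x 2 matrix applied to the deviations.
    return (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det


# Hypothetical occupied points H: (shrub cover in percent, patch size in ha).
H = [(20.0, 5.0), (30.0, 9.0), (25.0, 6.0), (35.0, 10.0), (40.0, 15.0)]
mu = mean_vector(H)
sigma = covariance_2x2(H)

print(mahalanobis_d2(mu, mu, sigma))            # the centroid itself
print(mahalanobis_d2((25.0, 6.0), mu, sigma))   # one of the occupied points
print(mahalanobis_d2((30.0, 2.0), mu, sigma))   # typical cover, atypical patch size
```

The third point has average shrub cover but a patch size that is out of step with it, so its D² (about 44.5) is far larger than that of the occupied point (about 0.7); this is the effect of scaling by the covariances as well as the variances.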

D² has been used successfully to link habitat use
with a GIS to describe the distribution of species as 
disparate as black bears (Ursus americanus) in 
Arkansas forests (Clark et al. 1993b) and black-tailed 
jackrabbits (Lepus californicus) in Idaho shrubsteppe 
(Knick and Dyer 1997). However, an attempt to ex- 
trapolate the model developed in the jackrabbit exam- 
ple to shrubsteppe landscapes evolving under either 
natural or managed patterns of fire and plant commu- 
nity succession was far less successful (Knick and 
Rotenberry 1998). This latter exercise highlighted the 
biological inferences underlying the use of D² and
pointed out some of its shortcomings, which are
discussed below.

One of the advantages of using D² is that one need
measure only the set of “used” or occupied points.
This avoids the ambiguity associated with including
“unused” points (points surveyed but at which the
species was not detected) in any resource selection
model used for mapping. We assume that more often
than not a species is at a point at which it is detected
because it “wants” to be, that the habitat there satisfies
some basic requirement. Points at which the
species was not detected are ambiguous in the sense
that they could represent habitat that is suitable but
currently unoccupied; habitat that is suitable and occupied,
but the observer failed to detect the species; or
habitat that is unsuitable. Although consideration of
“unused” points may be necessary to identify patterns
of nonrandom resource use, such patterns will display
a strong sampling scale-dependency (Wiens and
Rotenberry 1981b; Wiens 1989a), a feature undesirable
when extrapolating to new areas (which carries
an implicit change in scale). Thus, use of D² does
not depend on any arbitrary definition of study
boundaries.

An additional useful attribute of D² is that deviations
are scaled by the variance-covariance matrix.
This not only standardizes across variables, but also 
explicitly incorporates their intercorrelations, which 
otherwise may cause problems in multiple regression- 
based approaches. 

Use of D² for mapping explicitly assumes a selection
function that is based on an observed multivariate
mean (μ) and its covariance matrix (Σ); the farther
any point y lies from that mean, in standardized
units, the less appropriate the habitat. Therefore,
any deviation from the original habitat mean vector, 
even if it is in a direction that is actually biologically 
positive, is translated as less-desirable habitat. Return- 
ing to the hypothetical example of shrubland patch 
size, suppose that the species in question thrives as it 
finds itself in larger and larger patches, although it can 
hang on in smaller ones, which are occupied at a low 
frequency. Further suppose that the originally sampled 
landscape contained only relatively small patches 
(hence, the mean occupied patch size would be small). 
Were we to sample a new landscape with larger
patches, mapping D² would indicate that points
falling in large patches would actually be less suitable, 
because they deviate from the mean occupied patch 
size in the original sample. A phenomenon similar to 
this appeared to be responsible for the lack of success 
in extrapolating jackrabbit distributions in response to 
vegetation change cited above (Knick and Rotenberry 
1998). 
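
The patch-size example can be made concrete with a univariate sketch. The patch sizes are invented and the function name is hypothetical; the calculation is the univariate form of D², the squared deviation from the mean occupied patch size divided by its variance.

```python
import statistics


def d2_univariate(value, sample):
    """Squared deviation from the sample mean, scaled by the sample variance."""
    mean = statistics.mean(sample)
    variance = statistics.variance(sample)
    return (value - mean) ** 2 / variance


# Hypothetical patch sizes (ha) at occupied points in a landscape of small patches.
occupied_patch_sizes = [2.0, 3.0, 4.0, 5.0, 6.0]

for patch in (4.0, 6.0, 20.0, 50.0):
    d2 = d2_univariate(patch, occupied_patch_sizes)
    print(f"patch = {patch:5.1f} ha  D2 = {d2:.1f}")
```

A 20-ha or 50-ha patch in a new landscape receives a very large D² (102.4 and 846.4 here) and would be mapped as poor habitat, even if the species actually does better in larger patches.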

A further implication of the assumption that the se- 
lection function can be characterized by an observed 
mean and variance is that the original sample reflects 
the optimal habitat distribution of the animals in the 
sampled landscape. As a corollary, it assumes that the 
selection response has been fully characterized (at 
least in the vicinity of the mean), or in other words, 
that μ and Σ fully characterize the species’ response to
habitat. This implies two additional features: the sam- 
pled area contains the full range of habitat variation 
to which the species responds, and we have identified 
and measured the appropriate variables (i.e., we have 
not left out any that are important, and we have not 
included any that are irrelevant). We leave it to the 
reader to judge the likelihood that these assumptions 
are met in any particular study, but we think the prob- 
ability can often be low. 

Although D² appears to work well for static,
well-sampled landscapes where a species' distribution has
been fully characterized and one has a strong biologi- 
cal basis for selecting variables to include in modeling, 
it may not be appropriate for other common situa- 
tions (Knick and Rotenberry 1998). In particular, it 
may be prone to fail if applied to areas not included in 
the original sample or if applied to dynamic land- 
scapes, such as those that are disturbance prone 
(whether natural, such as fire, or anthropogenic, such 
as logging), or that are undergoing restoration or suc- 
cession. Yet, these are the landscapes for which map- 
ping applications may have the most value from a 
conservation perspective. 


An Alternative Model 


Mapping techniques based on dissimilarity to an opti- 
mum configuration may not be ideal for predicting an- 
imal use areas in a changing environment because we 
can never be certain of defining a biological optimum. 


We propose, instead, that identification of a minimum 
set of basic habitat requirements of a species is more 
appropriate for predicting potential animal use in 
changing environments. We also propose that such re- 
quirements are easier to identify correctly from a sam- 
ple than is an optimal vector. 

We assume that there is “habitat selection.” For an
animal to occur at a point, we assume that the values 
of some of the variables at the point, either singly or 
in combination, satisfy some basic requirement of the 
species. Although not all points that contain appropri- 
ate values of the variables are necessarily occupied, we 
do assume that points that are occupied represent at 
least some minimally suitable configuration of habitat. 
Furthermore, some habitat variables, although measured,
are irrelevant. Whether correlated with “important”
variables or not, we want to ensure that they
have only a minimal effect in obscuring the detection 
of the relevant patterns. 

We show below that D²(y) can be partitioned into p
separate components, each representing the independent
squared, standardized distance between y and a
“plane of closest fit” derived from H (Pearson 1901;
Anderson 1958; Collins 1983). We suggest that the 
plane of closest fit constitutes a first-order, linear ap- 
proximation to the most basic combination of habitat 
variables required by a target species. 


Partitioning D²


It is relatively easy to show that D² can be partitioned
into p separate components.

Eqn. 22.2a: The spectral decomposition of Σ (e.g.,
Seber 1984) is given by

$$\Sigma = \sum_{j=1}^{p} \lambda_j \, \alpha_j \alpha_j'$$

Eqn. 22.2b: from which it follows that the spectral
decomposition of Σ⁻¹ is

$$\Sigma^{-1} = \sum_{j=1}^{p} \alpha_j \alpha_j' / \lambda_j$$

where $\lambda_1 < \dots < \lambda_p$ are the eigenvalues of Σ (for
convenience in reverse order from the usual principal
components analysis presentation), and $\alpha_1, \dots, \alpha_p$ are
their associated eigenvectors (p × 1), normalized to
length one.

Eqn. 22.2c: Substituting Eqn. 22.2b into Eqn. 22.1
yields

$$D^2(y) = \sum_{j=1}^{p} (y - \mu)' \alpha_j \alpha_j' (y - \mu) / \lambda_j$$

To see the partitioning of D² more clearly,
Eqn. 22.2d: Let

$$d_j = (y - \mu)' \alpha_j$$

Note that $d_j$ will be a scalar, the result of multiplying
vectors (1 × p)(p × 1). Also note that

$$(y - \mu)' \alpha_j = \alpha_j' (y - \mu)$$

Eqn. 22.2e: Substituting Eqn. 22.2d into Eqn.
22.2c yields

$$D^2(y) = \sum_{j=1}^{p} d_j^2 / \lambda_j$$

And thus D² can be partitioned into p scalar components.
The task remains to attach meaning to these
components, which we do by showing their correspondence
to Pearson’s planes of closest fit (Pearson 1901).
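
The partition can be verified numerically. The sketch below is limited to p = 2 so that the eigenvalues and eigenvectors can be written in closed form without a linear algebra library; the centroid, variance-covariance matrix, test point, and function names are all hypothetical.

```python
import math


def eigen_2x2(sigma):
    """Eigenvalues (ascending) and unit eigenvectors of a symmetric 2 x 2 matrix."""
    (a, b), (_, d) = sigma
    if b == 0.0:
        # Variables uncorrelated: the axes themselves are the eigenvectors.
        return sorted([(a, (1.0, 0.0)), (d, (0.0, 1.0))])
    half_trace = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    pairs = []
    for lam in (half_trace - radius, half_trace + radius):
        # (b, lam - a) solves (Sigma - lam I) v = 0 when b != 0.
        vx, vy = b, lam - a
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs


def d2_components(y, mu, sigma):
    """Return the p = 2 terms d_j**2 / lambda_j, smallest eigenvalue first."""
    dev = (y[0] - mu[0], y[1] - mu[1])
    terms = []
    for lam, (vx, vy) in eigen_2x2(sigma):
        d_j = dev[0] * vx + dev[1] * vy      # d_j = (y - mu)' alpha_j
        terms.append(d_j ** 2 / lam)
    return terms


def d2_direct(y, mu, sigma):
    """(y - mu)' Sigma^-1 (y - mu) using the explicit 2 x 2 inverse."""
    (a, b), (_, d) = sigma
    dx, dy = y[0] - mu[0], y[1] - mu[1]
    return (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / (a * d - b * b)


# Hypothetical centroid and variance-covariance matrix of occupied habitat.
mu = (30.0, 9.0)
sigma = [[62.5, 30.0], [30.0, 15.5]]
y = (30.0, 2.0)

terms = d2_components(y, mu, sigma)
print(terms, sum(terms), d2_direct(y, mu, sigma))
```

The two components sum to the directly computed D². In this example almost all of the distance comes from the component associated with the smallest eigenvalue.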


Ecological Rationale for 
Pearson’s Planes of Closest Fit 


We want to identify the constant relationship in a 
species’ distribution (i.e., which functions of the vari- 
ables maintain a consistent value where the species oc- 
curs). These variables may be thought of as represent- 
ing basic requisites of the species. Functions that have 
a relatively high variance (take on many different val- 
ues) are less likely to be informative since such func- 
tions are not restrictive of a species’ distribution, at 
least over the range of variation sampled. This concept 
has also been presented by Collins (1983) and Knopf 
et al. (1990), although in substantially different forms. 

Imagine that we perform a principal components 
analysis of H. Our interpretation of the resulting com- 
ponents is that they represent a rigid rotation of the 
original variable axes to a new set of axes (the components)
such that the first component accounts for the
maximum amount of variation in the original dispersion
of data points, the second accounts for the maximum
amount of remaining variation and is orthogonal
to the first, and so forth. The position of these new
axes with respect to the original ones is given by their 
eigenvectors, and the variances of data projected on 
these axes are given by their eigenvalues. Convention- 
ally, we focus on the first few components, those with 
large eigenvalues (i.e., those with large variances). 
However, given that the interpretation in the preced- 
ing paragraph suggests we should be looking for low 
variances, those components are exactly the wrong 
ones. The dimensions with low variances are the com- 
ponents at the other end, the ones with the smallest 
eigenvalues. The idea is that we can partition variation 
in H into components that represent basic habitat req- 
uisites versus those that do not. We can then discard 
from further consideration those components that are 
not restrictive to species utilization. 

Pearson (1901) defined the concept of a plane (or 
line) of closest fit to a system of points: that plane for 
which the sums of squares of the perpendiculars from 
the system of points to the plane is a minimum. Obvi- 
ously, then, the variance of these projections of points 
on a vector normal to such a plane will be a minimum 
as well. This is one of the principal attributes we are 
seeking; if we can identify such a plane based on 
measurement of habitat variables, then the distance 
from any point to that plane carries information about 
its relative suitability as habitat as related to the basic 
requirements of a species. Necessarily, each successive 
plane (1) passes through the centroid of the system, 
and (2) represents one additional restriction among 
the variables that characterize the habitat require- 
ments of a species. 
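
Pearson's definition can be illustrated with a small numerical experiment. The sketch below (Python with NumPy, synthetic three-dimensional data, all names hypothetical) takes the eigenvector with the smallest eigenvalue as the normal of the candidate plane and compares its sum of squared perpendicular distances with that of many randomly oriented planes through the centroid.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic points: wide spread along two axes, narrow spread along the third.
n = 200
X = rng.normal(size=(n, 3)) * np.array([5.0, 2.0, 0.2])
Xc = X - X.mean(axis=0)          # the plane of closest fit passes through the centroid

lam, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
normal = vecs[:, 0]              # eigenvector with the smallest eigenvalue


def sum_sq_perp(points, unit_normal):
    """Sum of squared perpendicular distances to the plane through the origin."""
    return float(np.sum((points @ unit_normal) ** 2))


best = sum_sq_perp(Xc, normal)

# Compare against many random candidate planes (random unit normals).
candidates = rng.normal(size=(1000, 3))
candidates /= np.linalg.norm(candidates, axis=1, keepdims=True)
others = np.array([sum_sq_perp(Xc, c) for c in candidates])

print(best, others.min())
```

No random plane should do better than the one normal to the smallest-eigenvalue eigenvector, and `best` equals (n - 1) times the smallest eigenvalue because `np.cov` divides by n - 1.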


Relation between the Partitions of $D^2$
and Planes of Closest Fit 


Eqn. 22.3a: From standard principal components
analysis,


$$x_j = y' \alpha_j$$


where $x_j$ is the length of the projection of y on an axis
defined by $\alpha_j$.
Eqn. 22.3b: Also from principal components analysis,


$$\operatorname{Var}[x_j] = \lambda_j$$


Eqn. 22.3c: From Pearson (1901), $\alpha_j$ is normal to a
$(p - 1)$-dimensional hyperplane defined by


$$(y - \mu)' \alpha_j = 0$$


Therefore, the perpendicular deviation of $(y - \mu)$
from this hyperplane is identical to its projection on
the axis defined by $\alpha_j$, so that the variance of these
distances is also $\lambda_j$.

Thus, the plane of closest fit is $(y - \mu)' \alpha_1 = 0$, since
the deviation $d_1 = (y - \mu)' \alpha_1$ has the smallest variance,
namely $\lambda_1$. Therefore $d_1^2/\lambda_1 = (d_1/\sqrt{\lambda_1})^2$
represents the square of this deviation in standard measure.

The second-best $(p - 1)$-dimensional hyperplane,
which satisfies $\operatorname{corr}[d_1, d_2] = 0$, is defined by
$(y - \mu)' \alpha_2 = 0$, with deviation $d_2 = (y - \mu)' \alpha_2$ and
variance $\lambda_2$, and a squared standardized distance of
$d_2^2/\lambda_2$, and so forth.

Thus $D^2(y)$ represents a sum of squared deviations,
in standardized measure, of a particular point with
coordinates given by y from each of p $(p - 1)$-dimensional
hyperplanes, all of which pass through the centroid of
the original p-dimensional sample (H):

Eqn. 22.4a:


$$D^2(y) = d_1^2/\lambda_1 + \dots + d_k^2/\lambda_k + \dots + d_p^2/\lambda_p$$


Our major premise is that not all of the p components
of $D^2(y)$ as partitioned above define limiting
combinations of habitat variables. Some $p - k$ of these
do not define habitat suitability but rather are included
in $D^2(y)$ simply because the investigator decided
a priori to measure p habitat variables. Certainly
the hyperplane corresponding to the first principal
component cannot be considered a limitation since the
variance of deviations from this hyperplane is $\lambda_p$, the
maximum possible. Despite the fact that $\alpha_p$ defines the
axis of maximum possible variation measured from its
normal hyperplane, it continues to contribute to
$D^2(y)$. Thus, we propose that habitat suitability for a
p-dimensional y be measured by

Eqn. 22.4b:


$$D^2(y;k) = d_1^2/\lambda_1 + \dots + d_k^2/\lambda_k$$

for some $1 \le k \le p$, where the eigenvalues of $\Sigma$ are
ordered $\lambda_1 \le \dots \le \lambda_p$. Thus, suitability of a particular
habitat location y for a species would be measured in
terms of deviations from k basic requirements for that
species, to the extent that we are able to know k.
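
To make Eqn. 22.4b concrete, here is a minimal sketch in Python with NumPy. The function name `partial_d2` and the synthetic inputs are assumptions for illustration, not part of the original analysis. The example also shows the property that motivates the model: displacing a point along the axis of greatest variance changes the full distance (k = p) but leaves the k = 1 measure unchanged.

```python
import numpy as np


def partial_d2(Y, mu, S, k):
    """D^2(y;k): sum of the k components d_j^2 / lambda_j with the smallest eigenvalues.

    Y  : (m, p) array of points to score (for example, one row per map cell)
    mu : (p,) mean habitat vector
    S  : (p, p) variance-covariance matrix
    k  : number of components retained, 1 <= k <= p
    """
    lam, alpha = np.linalg.eigh(S)          # ascending: lam[0] <= ... <= lam[p-1]
    d = (np.atleast_2d(Y) - mu) @ alpha     # d_j = (y - mu)' alpha_j for every point
    return np.sum(d[:, :k] ** 2 / lam[:k], axis=1)


# Illustrative (made-up) inputs.
rng = np.random.default_rng(2)
H = rng.normal(size=(40, 4)) @ rng.normal(size=(4, 4))
mu, S = H.mean(axis=0), np.cov(H, rowvar=False)
lam, alpha = np.linalg.eigh(S)

y = H[0]
shifted = y + 10.0 * alpha[:, -1]           # move along the axis of greatest variance

full_before = partial_d2(y, mu, S, 4)[0]
full_after = partial_d2(shifted, mu, S, 4)[0]
k1_before = partial_d2(y, mu, S, 1)[0]
k1_after = partial_d2(shifted, mu, S, 1)[0]
print(full_before, full_after, k1_before, k1_after)
```

Passing a two-dimensional array with one row per map cell returns one score per cell, which is the form needed to score a whole landscape.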


Methods 


We compared predictive models for $D^2(y)$ and $D^2(y;k)$
using presence/absence data on sage sparrows (Am- 
phispiza belli), a shrubland obligate species breeding in 
southwestern Idaho (Knick and Rotenberry 1995, 
1998; Rotenberry 1998). Much of this region, origi- 
nally dominated by shrubland communities character- 
ized by sagebrush (Artemisia tridentata), winterfat 
(Krascheninnikovia lanata), or shad-scale (Atriplex con- 
fertifolia), is currently in transition from a shrubland to 
a grassland-dominated state because exotic annual 
grasses have changed the size and frequency of wildfires 
(USDI 1996; Knick and Rotenberry 1997; Rotenberry 
1998). Therefore, predicting animal distributions after 
landscape changes caused either by fires or by restora- 
tion efforts is an important management concern. 

Our sampling design has been described in detail 
elsewhere (Knick and Rotenberry 1995, 1999). Briefly, 
we determined the presence of each species from sur- 
veys at 121 sites scattered throughout the Snake River 
Birds of Prey National Conservation Area (NCA). An 
additional thirty-nine points located at randomly determined
coordinates were surveyed for model verification.
We developed H (and the sample-based equivalents
of $\mu$ and $\Sigma$) from the characteristics of each point
where the species was detected in at least three of five 
years. We placed each survey point into our GIS 
(USACOE 1993) and calculated the total number of 
shrubland cells and the mean patch size of shrublands 
for areas within 0.5- and 2-kilometer radii around the 
point (Baker and Cai 1992). The map depicting the cur- 
rent landscape (see Fig. 22.1 in color section, top) had 
an 80 percent accuracy in separating shrub- and grass- 
dominated cells (Knick et al. 1997). We resampled the 
map to 150-meter cells from an original resolution of 
30 meters. 

We used a computer simulation to project land- 
scapes in our study area that might result from contin- 
uing current trends of extensive fires and subsequent 
loss of shrublands (Fig. 22.1, middle), or, alternatively,
from management for active fire suppression and 
shrub restoration (Fig. 22.1, bottom) (Knick et al. 
1996). Under the first scenario, we expect a decrease 
in the amount of habitat suitable for sage sparrows, 
whereas under the second we expect increasing re- 
gional suitability based on what we currently know 
about habitat associations of sage sparrows (e.g., 
Rotenberry and Wiens 1980; Wiens and Rotenberry 
1981a,b; Knick and Rotenberry 1995, 1999; Martin 
and Carlson 1998; Rotenberry and Knick 1999). We 
note that we could have specified any arbitrary land- 
scape to examine how its suitability for the target 
species changed; we simply used the simulation model 
to generate landscapes that have a good probability of 
developing from the current configuration. 

We measured habitat suitability for the four-dimensional
y represented by any point on the map by
$D^2(y;k) = \sum_{j=1}^{k} d_j^2/\lambda_j$ (Eqn. 22.4b) for $1 \le k \le p$, where the
eigenvalues of $\Sigma$ (or its sample analog) were ordered
$\lambda_1 \le \dots \le \lambda_p$. For each landscape (current plus simulated
scenarios for burned and restored landscapes),
we calculated $D^2(y)$ and $D^2(y;k)$ for each cell.


Results 


The two simulated landscape development scenarios 
produced fundamentally different landscape patterns
(Fig. 22.1). Under continued trends of extensive and
repeated fires, the modeled region became dominated 
by grasslands (Fig. 22.1, middle), whereas fire sup- 
pression and restoration activities produced a land- 
scape with extensive coverage of native shrublands 


TABLE 22.2. 


TABLE 22.1.


Mean habitat vector ($\mu$) for sage sparrows (Amphispiza belli).


Scale            Variable                Mean
0.5-km radius    % shrubland cells       TL
                 Shrubland patch size    258
2-km radius      % shrubland cells       59.2
                 Shrubland patch size    153.7


Note: 1. n = 36 occupied points. 2. Habitat characteristics were
determined in a GIS for a landscape classified into shrubland or
grassland categories. Percent shrubland cells represents the percentage
of 150-meter cells within the sampling radius that were classified into
the shrub category. Shrubland patch size represents the mean number
of cells for all shrubland patches within the sampling radius.


(Fig. 22.1, bottom), even more so than in the current, 
fire-dominated landscape (compare to Fig. 22.1, top). 

Using the current landscape habitat composition,
we developed the sample-based equivalents of $\mu$
(Table 22.1) and $\Sigma$ (Table 22.2), then used principal
components analysis to generate the associated
eigenvalues, $\lambda_j$, and eigenvectors, $\alpha_j$ (Table 22.3), from H,
which was defined by n = 36 points where we observed
sage sparrows in at least three of five years, and
p = 4 variables.
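
As a consistency check, the eigenvalues in Table 22.3 can be approximately recovered from the matrix printed in Table 22.2. The sketch below uses Python with NumPy; the numbers are transcribed from the two tables, while the variable names are our own. Because the printed matrix is rounded to a few decimal places, the two smallest eigenvalues can only be expected to match approximately.

```python
import numpy as np

# Sample variance-covariance matrix S as printed in Table 22.2
# (order: 0.5-km % shrubland cells, 0.5-km patch size,
#         2-km % shrubland cells, 2-km patch size).
S = np.array([
    [ 0.0585,  0.0500,    2.4999,    26.2929],
    [ 0.0500,  0.0615,    2.2324,    39.6685],
    [ 2.4999,  2.2324,  122.8272,  1214.2648],
    [26.2929, 39.6685, 1214.2648, 38923.342],
])

# Eigenvalues as printed in Table 22.3, ordered k = 1..4 (smallest first).
table_eigenvalues = np.array([0.0047789, 0.0120878, 84.908618, 38961.364])

lam, alpha = np.linalg.eigh(S)   # ascending eigenvalues; eigenvectors in columns

print("computed eigenvalues:", lam)
print("printed eigenvalues: ", table_eigenvalues)
print("trace of S:", np.trace(S))
print("sum of printed eigenvalues:", table_eigenvalues.sum())
```

The trace of S and the sum of the eigenvalues should coincide, since both equal the total variance. Note that the sign of each computed eigenvector is arbitrary, so columns of `alpha` may differ from Table 22.3 by a factor of -1.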

The map of predicted use areas for the $D^2(y)$ model
using four variables for the current landscape (see Fig.
22.2 in color section, top) reflected the distribution of
shrublands. As expected, the $D^2(y)$ model predicted
use by sage sparrows only in those regions where
shrublands remained in the burned landscape (Fig.
22.2, middle). However, the $D^2(y)$ model did not predict
use by sage sparrows in the larger shrublands in
the restored landscape (Fig. 22.2, bottom). Instead,
the $D^2(y)$ model predicted use only on the edges of


TABLE 22.2.


Variance-covariance matrix (S) for sage sparrows (Amphispiza belli).


                                         0.5-km radius                2-km radius
                                     No. shrubland  Shrubland    No. shrubland  Shrubland
                                     cells          patch size   cells          patch size
0.5-km radius  % shrubland cells         0.0585       0.0500         2.4999       26.2929
               Shrubland patch size      0.0500       0.0615         2.2324       39.6685
2-km radius    % shrubland cells         2.4999       2.2324       122.8272     1214.2648
               Shrubland patch size     26.2929      39.6685      1214.2648    38923.342


Note: n = 36 occupied points.


TABLE 22.3.

Eigenvectors and eigenvalues of S, based on H of sage
sparrows (Amphispiza belli).


                                 k
               1           2           3           4
Eigenvalue     0.0047789   0.0120878   84.908618   38961.364
Eigenvector    0.7923471   0.6097501   0.0197589   0.0006765
              -0.610011    0.7923062   0.0116852   0.0010194
              -0.008523   -0.02134     0.9992474   0.0312492
               0.0003523  -0.000554   -0.031266    0.9995109


Note: n = 36 occupied points.


large shrubland patches, which were areas that most 
closely resembled the number of shrub cells and mean 
patch sizes of the originally sampled landscape. 

Eigenvalues associated with each of the four planes
represented successively increased relaxation of
restrictions on the habitat matrix of sage sparrows
(Table 22.3). The frequency distributions of values of
$d_j^2/\lambda_j$ for the combined verification and sample sites
where we observed sage sparrows also reflected the
decreased deviation from zero (Fig. 22.3). In particular,
sage sparrows were much more likely to be observed
at points lying close to the planes defined by k
= 2 and, especially, k = 1. There was no discernable
pattern of sage sparrow presence with respect to the
plane k = 3, and a strong negative relationship with
plane k = 4 (Fig. 22.3).

The pattern of predicted use by sage sparrows in
the current and burned landscapes produced by the
$D^2(y;k = 1)$ model resembled that modeled by $D^2(y)$
(compare Fig. 22.4 in color section, top and middle, to
Fig. 22.2, top and middle). However, use areas predicted
by $D^2(y;k = 1)$ more closely tracked expected
changes in distribution of sage sparrows in the restored
landscape than the $D^2(y)$ model. Unlike $D^2(y)$,
the $D^2(y;k = 1)$ model predicted use within the large
shrubland patches (Fig. 22.4, bottom).


Discussion 


We present an alternative model, $D^2(y;k)$, that predicts
animal use based on a minimum combination of a
species’ requirements rather than on a habitat’s relative
similarity to a mean or optimum set of conditions as
modeled by $D^2(y)$. The maps of predicted use by sage
sparrows, based on a minimum set of habitat conditions
and modeled by the value of $D^2(y;k = 1)$ relative
to the first plane, the plane of closest fit, tracked our
expected response to changes in landscape configuration.
In contrast, the $D^2(y)$ model, although accurate
for the landscape configuration in which H was obtained
(see Knick and Rotenberry 1999 for details),
was unable to track changes outside of the original
habitat. Our study emphasized that $D^2(y)$ represents
the mean habitat vector of species presence only for a
specific sampled landscape and does not necessarily
represent the mean or optimum set of conditions for
the species.

Use of $D^2(y;k)$ has the potential for coping with an
evolving environment in a way that $D^2(y) = D^2(y;p)$
does not. For $D^2(y) = D^2(y;p)$, habitat suitability is
limited in each of the p possible dimensions. For
$D^2(y;k)$, habitat suitability is limited in only k
dimensions; unlimited habitat variation may occur
in any of the $p - k$ remaining dimensions without
affecting the value of $D^2(y;k)$. The only requirement is
that habitat variation not proceed in any of the directions
parallel to mutually orthogonal axes defined by
$\alpha_1, \dots, \alpha_k$. This is to say that habitat change may
occur parallel to any of the axes defined by $\alpha_{k+1}, \dots,
\alpha_p$ without affecting $D^2(y;k)$.

The choice of k is likely to be somewhat qualitative,
depending on the magnitudes and relative spacing
among the eigenvalues of $\Sigma$, or its sample analog,
and the credibility of the GIS results that occur by the
use of particular choices of k. Some guidance might be
provided by considering a sequence of tests of the
sphericity hypothesis based on eigenvalues of the
correlation matrix associated with $\Sigma$ (e.g., Morrison
1990).

The question of what constitutes the "best" original
variables to measure remains at the discretion of
the investigator. In our approach, a good variable is 
one that takes on certain values that are closely associ- 
ated with the occurrence of a species. We assume that 
an investigator has some criteria (e.g., biological intu- 
ition, personal experience, previous studies) on which 
to base initial variable selection. Theoretically, our ap- 
proach should accommodate the inclusion of less- 
than-useful variables by shunting them to the 
less-than-useful components. Whether this extends to 
variables to which a species responds in a nonlinear
or threshold manner remains to be investigated. 

Each of the four planes describing sage sparrow 
presence represented increasing variance in the com- 
bination of variables that described the set of used 
habitats. In other words, each additional plane repre- 
sented a less-consistent set of habitat conditions de- 
scribing the sites used by sage sparrows. To this 
point, we can offer no more than a geometric inter- 
pretation of the plane of closest fit; identification of 
"important variables" in the original measurement 
space remains problematic. Although Collins (1983) 
briefly mentions planes of closest fit in an analysis of 
geographic variation in avian habitat selection, we 
do not believe that his interpretation of the technique 
as involving the variables with the largest correla- 
tions (i.e., factor loadings) on the principal compo- 
nents with the lowest eigenvalues is necessarily 
correct. Correlation assumes a comparison of projec- 
tions of points on two axes, whereas our problem in- 


volves characterizing the association between each 
measurement axis and a plane. 


Future Research 


The approach outlined above is clearly only a start, 
and much remains to be learned of its limitations
before we can confidently substitute it for most
previous approaches. For example, our current
"verification" of the model consists only of the perception
that it produces a distribution of likely sage sparrow 
habitat that much more closely mirrors our biologi- 
cally based intuitive expectation than does a compet- 
ing model. Assessments of accuracy explored in this 
volume should be applied in a more extensive test of 
the model. Likewise, our analysis is based on a rela- 
tively small number of observations and only four, 
somewhat artificial, variables. This is not an alto- 
gether trivial data set, however. Although the num- 
ber of observations is low, it is indicative of the like- 
lihood of detecting this species throughout the 
immediate region. Likewise, these variables were se- 
lected because (1) we expected them to differ among 
our simulated landscapes, and (2) we expected that 
sage sparrow distribution would be related to these 
measurements. We are currently working on a larger 
data set with more species, more species observations 
(including an independent verification set), and a 
wider array of variables to probe the model more 
deeply. Finally, our inclusion of a point within the 
data set was based only on species presence. If other 
measures of a species’ performance at a point are 
available (e.g., breeding success), then we may be 
able to interpret planes of closest fit in a context 
more directly related to a species’ evolutionary 
fitness. 

Three statistical issues remain to be elucidated with 
further work as well. First, we need to explore differ- 
ences arising from the use of unstandardized versus 
standardized data. We employed the former, but it is 
reasonable to expect that use of the latter may yield 
different patterns, since eigenvalues and eigenvectors 
can change with changes in measurement scales. Sec- 
ond, we need to develop a method that permits more 
than simply a geometric interpretation of a plane of 
closest fit. Interpreting planes in the context of the 
habitat variables that were originally measured will
assist a manager in developing any habitat manage- 
ment scheme, as well as provide information to a biol- 
ogist seeking to understand the mechanisms underly- 
ing a habitat association. Finally, we need to devise 
more quantitative methods for selecting k. This is 
likely to prove analogous to attempts to determine the 
number of “significant” principal components, a 
search that is still underway (Jackson 1993). 
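
The first of these issues, sensitivity to measurement scale, can be previewed with the printed covariance matrix. The sketch below (Python with NumPy; values transcribed from Table 22.2, variable names our own) converts S to the correlation matrix it implies and compares how the total variance is distributed among components in the unstandardized and standardized cases. It is an illustration of the concern, not an analysis the authors report.

```python
import numpy as np

# Variance-covariance matrix S as printed in Table 22.2.
S = np.array([
    [ 0.0585,  0.0500,    2.4999,    26.2929],
    [ 0.0500,  0.0615,    2.2324,    39.6685],
    [ 2.4999,  2.2324,  122.8272,  1214.2648],
    [26.2929, 39.6685, 1214.2648, 38923.342],
])

sd = np.sqrt(np.diag(S))
R = S / np.outer(sd, sd)            # correlation matrix implied by S

lam_cov = np.linalg.eigvalsh(S)     # unstandardized data (ascending)
lam_cor = np.linalg.eigvalsh(R)     # standardized data (ascending)

share_cov = lam_cov / lam_cov.sum() # proportion of total variance per component
share_cor = lam_cor / lam_cor.sum()

print("covariance-based shares: ", np.round(share_cov, 6))
print("correlation-based shares:", np.round(share_cor, 6))
```

With unstandardized data the largest component is dominated by the variable with the largest variance (2-km patch size), whereas standardization spreads the variance more evenly, so the components with the smallest eigenvalues, and hence the planes of closest fit, need not be the same in the two cases.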

In conclusion, we believe that the statistical ap- 
proach we have outlined above contains biologically 
realistic assumptions, assumptions that are more likely 
to be met under current sampling regimes than alternative
analyses require. Moreover, this technique is
more able to cope with evolving environments and is 
more easily extensible to previously unsampled envi- 
ronments. As such, we believe it will be most useful in 
identification of areas of conservation interest and 
their applications to land-use planning. 
Geospatial Data in Time:
Limits and Prospects for 
Predicting Species Occurrences 


Geoffrey M. Henebry and James W. Merchant 


Robust integration of time into geospatial (or geographic)
information systems (GIS) is a grand
challenge but one that must be addressed if we are to 
monitor, model, and manage biodiversity. In this chap- 
ter, we examine some of the deeper conceptual issues 
that must be confronted in order to predict species dis- 
tributions in space and time. Conceptual challenges 
are often resolved through development of particular 
algorithms and specific technological implementa- 
tions. It can be difficult for the practitioner or the 
manager to discern, amidst a choice of GIS tools, the 
underlying worldview that informs, defines, and limits 
these tools. We shall touch upon several key chal- 
lenges to integrating geospatial data effectively into 
predictive models of habitat. These issues lie deeper 
than the choice of minimum mapping unit or image 
resolution, and they highlight current limitations in 
theory and point toward future directions for research 
and development. Some of the visceral unease of GIS 
expressed by Van Horne (Chapter 4) is justified: the 
tools may appear more robust and more objective 
than is the case. Yet, as we practice with tools, we 
learn to use them better, as Stauffer (Chapter 3) relates 
in his historical review of animal habitat modeling. 
There are currently two fundamental approaches to 
estimating spatial occurrence of species: (1) range 
maps interpolated and/or extrapolated from sparsely 
sampled field observations (e.g., museum voucher 


specimens, county dot maps), and (2) potential range 
maps developed from modeled habitat availability. In 
both cases, range maps are typically portrayed as 
static. We shall focus on issues in modeling habitat 
within the context of GIS and will highlight some of 
the current conceptual and technical difficulties en- 
countered when introducing temporal structures into 
geospatial data. For illustration, we examine the chal- 
lenges in predicting the spatial distribution of habitat 
for an anuran species, the Great Plains toad, Bufo cog- 
natus. We conclude with a look to the prospects for 
improving prediction of species occurrences using new 
types of geospatial data. 


The Challenge of Geospatial Data in Time 


Species occur in space through time. Each species has 
a range of tempos and spatial extents and resolutions 
associated with the resources it requires for persist- 
ence. These resources can be available recurrently or 
episodically in habitat that is distributed across the 
landscape. Ecological patterns and processes weave 
their mutual causality through space and time (Turner 
1989; Allen and Hoekstra 1992; Brown 1995; Maurer 
1999); thus, if species are to be viewed as spatiotem- 
poral entities, then habitat characterizations must also 
include dynamical portraits of (at least) the limiting 
resources. 

Tracking resource location in space and time 
requires time series of geospatial data. Data are
geospatial when they are organized with coordinates 
that relate sampling points to locations on the planet 
using some two-dimensional or three-dimensional po- 
sitional reference system. Not all spatial data are 
geospatial because georeferencing is not always feasi- 
ble, especially for legacy datasets. Today, GISs are 
commonly used to facilitate the organization, manipu- 
lation, visualization, and analysis of geospatial data 
(Hunsaker et al. 1993; McCloy 1995). Habitat model- 
ing with geospatial data is, however, fraught with sig- 
nificant methodological challenges (aside from the fa- 
miliar and chronic problem of data availability). 
Major issues that need to be addressed include auto- 
correlation, issues of rescaling, model validation, and 
data representation, including the representation of 
time in GIS. 


Autocorrelation 


The issue of autocorrelation faces any treatment of 
spatial data. Autocorrelation is a characteristic of data 
derived from a process that is articulated in one or 
more dimensions and it describes the error structure of 
the data (Cliff and Ord 1981; Henebry 1995). 
Geospatial data frequently exhibit positive spatial au- 
tocorrelation, that is, the residual deviations are less 
variable than random expectation because proximate 
neighbors are more similar than distant neighbors. 
This situation complicates model parameter estimation,
model error analysis, and identification of significant
statistical relationships between variables because
it enhances the risk of making a Type I error, or
in other words, rejecting the null hypothesis when it
is, in fact, true. The naive regression of
one mapped variable against another will fall into this 
trap of inflated and erroneous significance levels. Spa- 
tiotemporal autocorrelation arises naturally from dy- 
namical portrayals of spatially explicit ecological 
processes (Henebry 1993). Density-dependent processes
can readily produce negative spatiotemporal
autocorrelation (Henebry 1995). The hazard in this
situation arises from residuals that are more variable than
random expectation, increasing the risk of a Type II
error, or in other words, failing to reject the null hypothesis
when it is false. Autocorrelation among


mapped attributes has also been found to be a princi- 
pal source of error propagation in GIS models based 
on simple map analyses, such as basic arithmetic oper- 
ations and superpositioning (Arbia et al. 1998; Grif- 
fith et al. 1999). 
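
The inflation of Type I error under positive autocorrelation can be demonstrated with a small Monte Carlo experiment. The sketch below is in Python with NumPy and rests on stated assumptions: it uses one-dimensional AR(1) series as a simple stand-in for autocorrelated mapped variables, and the function names, series length, and coefficient values are hypothetical choices. Pairs of independent series are tested for correlation with the usual t test, and the rejection rate is compared with the nominal 5 percent level.

```python
import numpy as np

rng = np.random.default_rng(3)


def ar1(n, phi, rng):
    """Generate an AR(1) series x[t] = phi * x[t-1] + e[t] with standard normal e."""
    e = rng.normal(size=n)
    x = np.zeros(n)
    x[0] = e[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]
    return x


def false_positive_rate(phi, n=100, trials=1000, t_crit=1.984):
    """Share of trials in which two INDEPENDENT series look significantly correlated.

    t_crit approximates the two-sided 5% critical value of Student's t
    with n - 2 = 98 degrees of freedom.
    """
    hits = 0
    for _ in range(trials):
        x, y = ar1(n, phi, rng), ar1(n, phi, rng)
        r = np.corrcoef(x, y)[0, 1]
        t = r * np.sqrt((n - 2) / (1 - r**2))
        hits += int(abs(t) > t_crit)
    return hits / trials


rate_white = false_positive_rate(phi=0.0)   # no autocorrelation
rate_auto = false_positive_rate(phi=0.9)    # strong positive autocorrelation
print(rate_white, rate_auto)
```

With no autocorrelation the rejection rate should sit near the nominal level, whereas strongly autocorrelated but unrelated series are declared "significantly" correlated far more often.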

Autocorrelation can be harnessed, however, as a ro- 
bust indicator of spatial structure and its temporal de- 
velopment (Henebry 1993; Henebry and Su 1993). In- 
formation on the arrangement of habitat involves 
spatial autocorrelation. Hiebeler (2000) demonstrates 
how the intensity of positive spatial autocorrelation— 
via habitat clustering—affects population processes. 
He argues that much of the ecological research devel- 
oped on the concept of patch distributions could be 
more robustly articulated in terms of clustering inten- 
sities. Positive spatial autocorrelation is expected to be 
higher within habitat patches than at the margins 
(Brown et al. 1995; Maurer, Chapter 9). It is just this 
positive spatial autocorrelation within habitat patches 
that may be responsible for the apparent success of 
Pearson's planes of closest fit (Rotenberry et al., Chap- 
ter 22): high positive spatial autocorrelation of a habi- 
tat feature minimizes its residual variance and thus the 
feature appears as a habitat constant. 

What are some options for dealing with spatial au- 
tocorrelation? There are two general strategies for 
coping with the negative consequences of spatial
autocorrelation on statistical inference: explicit
incorporation of local spatial interactions and distribution-free
methods. Researchers in the field of spatial economet- 
rics have pioneered the former approach (Anselin 
1988; Pace and Barry 1997; Bivand 1998). Autologis- 
tic regression is one form of this approach: Klute et al. 
(Chapter 27) demonstrate how the autologistic model 
incorporates spatial autocorrelation explicitly and 
thereby improves predictive performance over stan- 
dard logistic regression. Distribution-free methods rely 
on randomization (permutational and combinatorial 
techniques), resampling (bootstrap and jackknife), and 
Monte Carlo sampling to assess the spatial, temporal, 
or spatiotemporal configurations of samples against 
empirical distributions synthesized using
computer-intensive methods (Manly 1997).
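
A distribution-free assessment of spatial autocorrelation can be sketched as follows, in Python with NumPy. The example computes Moran's I on a regular grid with rook (edge-sharing) neighbors and unit weights, then builds an empirical reference distribution by randomly reshuffling cell values. The function names, the neighbor definition, and the synthetic surface are assumptions chosen for illustration; other weighting schemes are equally legitimate.

```python
import numpy as np


def morans_i(grid):
    """Moran's I for a 2-D array using rook neighbors with unit weights."""
    z = grid - grid.mean()
    # Each adjacent pair contributes twice (i -> j and j -> i).
    cross = 2.0 * (np.sum(z[:, :-1] * z[:, 1:]) + np.sum(z[:-1, :] * z[1:, :]))
    n_rows, n_cols = grid.shape
    w_sum = 2.0 * (n_rows * (n_cols - 1) + (n_rows - 1) * n_cols)  # total weight
    return (grid.size / w_sum) * cross / np.sum(z**2)


def permutation_p_value(grid, n_perm=999, seed=0):
    """One-sided p-value for positive autocorrelation by reshuffling cell values."""
    rng = np.random.default_rng(seed)
    observed = morans_i(grid)
    flat = grid.ravel()
    count = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(flat).reshape(grid.shape)
        count += int(morans_i(shuffled) >= observed)
    return observed, (count + 1) / (n_perm + 1)


# Illustrative surface: a smooth gradient plus noise (positively autocorrelated).
rng = np.random.default_rng(4)
r_idx, c_idx = np.mgrid[0:20, 0:20]
surface = 0.3 * r_idx + 0.3 * c_idx + rng.normal(size=(20, 20))

obs_i, p_val = permutation_p_value(surface)
print(obs_i, p_val)
```

The permutation approach makes no distributional assumption about the cell values; it asks only how often a random spatial arrangement of the same values yields an index at least as large as the one observed.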

Fielding and Bell (1997) in a review of methods for 
error assessment of presence/absence models touched 
on how spatial autocorrelation and spatial context af- 


fect prediction errors. They recommended some useful 
strategies for dealing with the spatial dimension, but 
as presence/absence models typically lack dynamical 
structures, they did not broach the issue of spatiotem- 
poral autocorrelation and its effects on model predic- 
tions and patterns of errors. Fielding (Chapter 21) 
mentions how differential spatial weighting of predic- 
tion errors enables observational context to be intro- 
duced explicitly into model assessment. 


Rescaling Relationships 


Significant differences often exist between the spatial 
and temporal scales at which data are collected—via 
remote sensing, GIS, and field methods—and the 
scales at which ecological and environmental models 
operate (Goodchild 1997). Moreover, as some 
processes require analyses at multiple scales, nested 
and coupled modeling strategies are being used with 
increasing frequency (Bian 1997). Mismatches be- 
tween scales of activity, observation, and modeling 
can lead to inappropriate rescaling of rates and rela- 
tionships. The general problem of scale dependence 
has been investigated for spatial data in geography as 
the Modifiable Areal Unit Problem (MAUP) (Open- 
shaw 1984; Arbia 1989; Jelinski and Wu 1996) and 
for temporal data in ecology as transmutation 
(O’Neill 1979; King et al. 1991) and aggregation error 
(Cale and Odell 1980; Gardner et al. 1982; Rastetter 
et al. 1992). 

The MAUP stems from twin errors. First, there is 
the problem of assuming implicitly or explicitly that 
patch-specific data applies to all individuals dwelling 
in the patch. Geographers have dubbed this imprudent 
inferential practice the “ecological fallacy.” Second, 
there is the problem of treating spatially aggregated 
data as individual observations in analysis. There are 
no natural a priori spatial units; they are all imposed 
by our observational processes (Allen and Starr 1982; 
Allen and Hoekstra 1992). Thus, delineations between 
patches are arbitrary and may be imprecise in loca- 
tion, transitory in duration, and irrelevant to underly- 
ing population structure. 

The principal undesirable consequence of the 
MAUP is equivocal statistical analysis: by simply vary- 
ing either data resolution through aggregation or data 
allocation through alternative zonations, the entire
spectrum of correlations may be extracted from the 
same dataset (Openshaw 1984; Arbia 1989). By en- 
abling the user to define and redefine areal units, GIS 
can actually exacerbate the MAUP and promote dis- 
covery of spurious correlative relationships (Open- 
shaw and Alvanides 1999). 
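
The scale component of the MAUP is easy to reproduce. In the sketch below (Python with NumPy), two hypothetical mapped variables share a weak broad-scale gradient but have independent fine-scale noise; the grid size, noise level, and block sizes are arbitrary assumptions. The same data are then aggregated into progressively larger square blocks and the correlation is recomputed at each resolution. Only aggregation is varied here; alternative zonations at a fixed resolution are not explored.

```python
import numpy as np


def block_mean(grid, b):
    """Aggregate a 2-D grid by averaging non-overlapping b x b blocks."""
    r, c = grid.shape
    return grid.reshape(r // b, b, c // b, b).mean(axis=(1, 3))


rng = np.random.default_rng(5)
r_idx, c_idx = np.mgrid[0:60, 0:60]
trend = (r_idx + c_idx) / 60.0                 # shared broad-scale gradient

# Two hypothetical mapped variables: same gradient, independent fine-scale noise.
x = trend + rng.normal(scale=2.0, size=trend.shape)
y = trend + rng.normal(scale=2.0, size=trend.shape)

correlations = {}
for b in (1, 2, 5, 10, 20):
    xa, ya = block_mean(x, b), block_mean(y, b)
    correlations[b] = np.corrcoef(xa.ravel(), ya.ravel())[0, 1]
    print(f"{b:>2} x {b:<2} blocks: n = {xa.size:4d}, r = {correlations[b]:.3f}")
```

At the cell level the two variables appear nearly unrelated, but averaging suppresses the independent noise and the shared gradient dominates, so the correlation climbs as the blocks grow even though the underlying data never change.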

There are no objective solutions to the MAUP. 
Rather, its effects can be attenuated by knowing well 
the system under study in order to be able to specify 
ecologically meaningful ways to (dis)aggregate data.
Huston (1999) argued for the need to distinguish care- 
fully between processes occurring at biogeographic 
and local scales to predict variation in species assem- 
blages. Trani (Chapter 11) indirectly approached the 
MAUP through a comparative analysis of the scaling 
behavior of spatial pattern metrics. The MAUP lurks 
also in the amphibian habitat modeling effort de- 
scribed by Johnson et al. (Chapter 12). Working with 
a wealth of more than two hundred explanatory vari- 
ables and a rich and recent occurrence record, data-re- 
duction procedures were undertaken to attenuate mul- 
ticollinearity to yield a set of explanatory variables at 
three spatial extents: site-specific (from field data), 
local landscape (1-2-kilometer buffer around sites), 
and broader landscape (10-kilometer buffer). Al- 
though they found the variation encountered in 
species response to landscape habitat metrics unex- 
pected, it is not clear how dependent the results are on 
the assignment of buffer extents (Johnson et al.,
Chapter 12).

Transmutation error arises when rates are naively 
extrapolated. For example, rescaling foraging rates 
observed only during the growing season to an annual 
basis obscures the temporal clustering of resource cap- 
ture. Aggregation error arises when multiple state 
variables with different characteristic rates are lumped 
into a single rate-limited state variable. For example, 
rescaling life-stage dependent mortality rates to a sin- 
gle population mortality rate by taking the simple av- 
erage obscures important demographic dynamics. 
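
The aggregation example can be worked through with a few lines of arithmetic. The sketch below is plain Python; the stage names, counts, and mortality rates are invented solely for illustration and do not describe any real population. It contrasts the simple average of stage-specific mortality rates with the rate the population as a whole actually experiences.

```python
# Hypothetical stage-structured population (all numbers invented for illustration).
stages = {
    "egg/tadpole": {"count": 9000, "annual_mortality": 0.95},
    "juvenile":    {"count":  800, "annual_mortality": 0.60},
    "adult":       {"count":  200, "annual_mortality": 0.30},
}

total = sum(s["count"] for s in stages.values())

# Naive rescaling: unweighted mean of the stage-specific rates.
simple_average = sum(s["annual_mortality"] for s in stages.values()) / len(stages)

# Rate experienced by the whole population: total deaths divided by total individuals.
deaths = sum(s["count"] * s["annual_mortality"] for s in stages.values())
population_rate = deaths / total

print(f"simple average of stage rates: {simple_average:.3f}")
print(f"abundance-weighted rate:       {population_rate:.3f}")
```

Because the most numerous stage also suffers the highest mortality, the simple average (about 0.62) substantially understates the population-level rate (about 0.91), and a model driven by the lumped rate would misrepresent the demographic dynamics.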

The key to handling these kinds of modeling errors 
is to be aware that they exist and have potential to 
propagate uncertainty and degrade prediction accu- 
racy. Helpful tutorials illustrating the problems and a 
range of solutions are available (King 1991; King et al. 
1991; Rastetter et al. 1992). In addition, Schneider
(1994) provides a useful introduction to dimensional 
analysis that includes examples of analytical pitfalls 
commonly encountered when reconciling the scales of 
observation and prediction. 

The development of hierarchy theory in ecology has 
provided some useful conceptual tools to handle issues 
of scale deftly (Allen and Starr 1982; O'Neill et al. 
1986; Allen and Hoekstra 1992; Johnson 1996). Frac- 
tal neutral models (Milne 1991) and allometric power 
laws (West et al. 1997) offer robust means of reconcil- 
ing disparate scales (called renormalization by the 
physicists) as well as identifying changes in scaling be- 
havior. Milne et al. (1992) simulated herbivory across 
fractally structured landscapes to demonstrate how al- 
lometric scaling rules can reconcile the apparent be- 
havioral differences in hypothetical species. Along 
similar lines, Theobald and Hobbs (Chapter 59) illus- 
trated how the process of habitat fragmentation can 
be better described in terms of the functional rescaling 
of species body size and physiology. The common so- 
lution to rescaling problems is the judicious applica- 
tion of prior knowledge of how the system works to 
the modeling process. Modeling should not be con- 
ducted in a knowledge vacuum. Geospatial data com- 
plement, not substitute for, natural history and field 
data. 


Model Validation 


Model validation is never easy. The very spatial exten- 
siveness of geospatial data magnifies the validation 
problem. For example, methods to determine the ac- 
curacy of maps portraying land-cover characteristics 
derived from coarse-resolution space-borne sensors 
and mapped over vast areas are simply not well estab- 
lished. Traditional site-specific field-based verification 
methods, such as advocated by Congalton and Green 
(1999) and others, would require thousands of sam- 
ples collected within a narrow time window coinci- 
dent with sensor overpasses and thereby be prohibi- 
tively expensive, if even logistically feasible. Merchant 
et al. (1994) proposed that, at present, validation of 
large-area land-cover data sets will need to be a cumu- 
lative, multitiered process involving consideration of a 
number of different types of evidence that, collectively,
will serve to support or refute the accuracy of a given
product. Such multiple lines of evidence could include 
(1) selective site-specific field observations, (2) qualita- 
tive assessments of derived products by experts in 
land-cover characterization, (3) examination of the in- 
ternal logical consistency among variables within a 
multidimensional GIS in which land cover is one com- 
ponent layer, and (4) end-users’ assessments of the 
performance of derived habitat data in modeling 
species occurrences. 

The accuracy of maps predicting vertebrate range
distributions is even more difficult to assess than
maps portraying land cover or land use. Seasonality of 
habitat use and availability, nocturnal and fossorial 
habits, and low population densities can lead to low 
encounter rates. Regional to subcontinental distribu- 
tion of species, patchily distributed habitat, and inter- 
annual climatic variability can further complicate vali- 
dation efforts. 

Design-based inference offers an alternative evalua- 
tion framework to assess mapped habitat accuracy in 
a manner that is not sensitive to spatial autocorrela- 
tion. Stehman (2000) offers an accessible entry to this 
literature. A principal difference between design-based 
inference and the commonly employed model-based 
inference is the population targeted by the inference. 
Model-based inference seeks to discern the process or 
model that generates the observed sample: from pat- 
terns observed in the sample, an underlying process is 
inferred. Design-based inference takes as the object of 
interest an actual, well-defined, finite population of 
observations (such as a map) in order to describe char- 
acteristics of the population with some degree of toler- 
ance for uncertainty (Stehman 2000). In sharp con- 
trast to standard assumptions of model-based 
inference, design-based inference places few con- 
straints on the nature of the observations in order to 
conduct estimation of population characteristics. Ob- 
servations are not considered as random variables 
plucked from a specific probability distribution; rather 
they are considered fixed values with variability attrib- 
uted to sampling design. The most important conse- 
quence of this perspective is that the confounding ef- 
fects of spatial autocorrelation on estimation are 
effectively neutralized, although the precision of estimates
is still affected adversely (Stehman 2000). There
is, of course, a cost associated with the shift in inference
framework and relaxation of assumptions: there
is no way to predict the accuracy of unobserved data. 
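
As a minimal sketch of design-based estimation, consider the simplest probability design: a simple random sample of map cells drawn without replacement from a finite map. The function below applies the standard estimator of a proportion with its finite population correction. The function name and the sample counts are hypothetical, and other sampling designs would require other estimators.

```python
import math

def srs_accuracy(n_correct, n_sampled, n_population):
    """Estimate overall map accuracy from a simple random sample.

    Assumes cells were sampled without replacement from a finite map of
    n_population cells. Returns the estimated proportion correct and its
    standard error, including the finite population correction.
    """
    p_hat = n_correct / n_sampled
    fpc = 1.0 - n_sampled / n_population
    variance = fpc * p_hat * (1.0 - p_hat) / (n_sampled - 1)
    return p_hat, math.sqrt(variance)

# Hypothetical assessment: 164 of 200 checked cells agree with the map,
# which contains 50,000 cells in all.
p_hat, se = srs_accuracy(n_correct=164, n_sampled=200, n_population=50_000)
print(f"estimated accuracy = {p_hat:.3f}, standard error = {se:.4f}")
```

The variability here comes entirely from which cells happened to be drawn. If every cell were checked, the correction factor would drive the standard error to zero, but nothing could be said about any other map.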

In a simulation study, Karl et al. (Chapter 51) ex- 
plored how “rarity” affected model accuracy. Esti- 
mates of commission error conflate actual failures to 
observe a species in a predicted area with apparent
failures due to low encounter rates (Boone and Krohn 
1999). Actual failures point to model lack of fit or, 
more seriously, model misspecification. Apparent fail- 
ures, on the other hand, simply indicate inadequate 
sampling. Karl et al. (Chapter 51) concluded that (1) if 
species respond to readily observable habitat features, 
and (2) if knowledge of habitat associations is more 
accurate for rare species, then habitat-relationship 
models developed for those rare species ought to per- 
form as accurately as predictive models for common 
species, despite the high error rates caused by small 
sample sizes. This provocative conclusion hinges criti- 
cally on untested assumptions. 
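
The distinction between actual and apparent failures can be illustrated with a back-of-the-envelope calculation. The sketch below assumes that visits are independent and that the per-visit detection probability is constant. Both are simplifying assumptions, and the probabilities used are hypothetical.

```python
def prob_missed(p_detect, n_visits):
    """Chance that a species present at a site goes unrecorded on every visit."""
    return (1.0 - p_detect) ** n_visits

def visits_needed(p_detect, max_miss=0.05):
    """Smallest number of visits that keeps the miss probability <= max_miss."""
    if not 0.0 < p_detect <= 1.0:
        raise ValueError("p_detect must be in (0, 1]")
    n = 1
    while prob_missed(p_detect, n) > max_miss:
        n += 1
    return n

# Hypothetical per-visit detection probabilities for a common and a rare species.
for label, p in [("common", 0.5), ("rare", 0.1)]:
    print(f"{label}: {visits_needed(p)} visits to push the miss rate to 5 percent")
```

Under these assumptions, a species detected on one visit in ten requires roughly six times the survey effort of one detected on every second visit before an absence record carries comparable weight.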

Model validation goes beyond assessing a model's 
predictive performance based on data that are either 
similar to those used to develop the model (the case of 
cross validation) or distinctly different (model verifica- 
tion) (Cale et al. 1983; Warwick and Cale 1988). It 
extends to evaluating model performance under ad- 
verse conditions, such as uncertainty in input data and 
model parameters (Warwick and Cale 1988; Henebry 
1995). Computer-intensive model-error analyses using 
Monte Carlo methods (Kalos and Whitlock 1986; 
Mowrer 1999) enable wide-ranging explorations of a 
model's parameter space to estimate model reliability,
that is, the probability that a calibrated model will correctly
predict to within a predetermined level of accuracy
(Warwick and Cale 1987).

The advantage of the model reliability approach is 
that it imposes a decision-making framework on as- 
sessment of model performance. The reliability of a 
model is estimated by the frequency with which 
Monte Carlo predictions fall within a user-designated 
accuracy interval at some specified time and/or loca- 
tion. Operations on the empirical distribution of 
model outputs form the basis for decision statistics. 
For example, a model may correctly predict species 
occurrences within a 1-kilometer radius 60 percent of 
the time, given input data with 15 percent uncertainty, 
but that reliability might increase to 75 percent with a
reduction of the input uncertainty to 10 percent. For 
models with multiple variables, the joint reliability is 
usually not simply the product of the individual relia- 
bilities because variables are typically not independ- 
ent. For the decision maker, the utility of a model 
comes from its ease of interpretation and the degree of 
confidence that can be placed in its predictions. Relia- 
bility, therefore, is an attractive decision statistic in 
that it is easier to grasp than error measures based on 
sum of squares. Mapping model performance onto a 
binomial variable leaves no gray areas: either the 
model performed up to the decision maker's standards 
or it did not. 
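
A minimal sketch of the reliability calculation follows. The model, its coefficient, the Gaussian form of the input error, and the tolerance are all assumptions chosen for illustration. For simplicity, the reference value is the model's own prediction at the nominal input rather than a field observation.

```python
import random

def toy_model(rainfall_mm):
    """Hypothetical response: pond duration (days) rises linearly with rainfall."""
    return 0.4 * rainfall_mm

def reliability(model, nominal_input, rel_uncertainty, tolerance,
                n_runs=10_000, seed=1):
    """Fraction of Monte Carlo runs that land within the accuracy interval.

    Input error is assumed Gaussian with standard deviation equal to
    rel_uncertainty times the nominal input.
    """
    rng = random.Random(seed)
    reference = model(nominal_input)
    hits = 0
    for _ in range(n_runs):
        perturbed = rng.gauss(nominal_input, rel_uncertainty * nominal_input)
        if abs(model(perturbed) - reference) <= tolerance:
            hits += 1
    return hits / n_runs

r15 = reliability(toy_model, 100.0, 0.15, tolerance=5.0)
r10 = reliability(toy_model, 100.0, 0.10, tolerance=5.0)
print(f"reliability with 15 percent input uncertainty: {r15:.2f}")
print(f"reliability with 10 percent input uncertainty: {r10:.2f}")
```

Each run is scored as a hit or a miss, so the output is the binomial decision statistic described above. Tightening the input uncertainty raises reliability for the same accuracy interval.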

For any given model, reliability is not a uniquely 
determined characteristic. Diverse combinations of ac- 
curacy and uncertainty can yield the same reliability. 
Thus, it is appropriate to construct a database of 
model performance to aid in model assessment and 
decision making. This database can then serve for 
“what if” scenario analysis. For example, to explore
the possible consequences of data uncertainty the deci- 
sion maker specifies what magnitude of deviation is 
acceptable (accuracy interval) and what proportion of 
the time the model must provide acceptable answers 
(reliability). Similarly, to explore the consequences of 
a stringent prediction requirement (accuracy interval), 
the decision maker can specify the reliability and the 
degree of data uncertainty. The decision maker can 
thus review model performance under a variety of 
constraint scenarios. 

The routine use of Monte Carlo analyses in spa- 
tiotemporal modeling of natural resources sciences 
and management is not yet established, but the need 
has been recognized (Henebry 1995; Gascoigne and 
Wadsworth 1999; Mowrer 1999). Likewise, effective 
techniques for visualization of map error/uncertainty 
(Fotheringham et al. 1996; Beard and Buttenfield 
1999) and understanding of error propagation in GIS 
are still rudimentary but experience is rapidly increas- 
ing (Heuvelink and Goodchild 1998; Griffith et al. 
1999). However, it is significant to note that wildlife 
habitat relationship models constructed primarily 
through Boolean overlay operators on GIS data (e.g., 
Dettmers and Bart 1999) are particularly susceptible 
to complex propagation of errors in attributes and 
locations. Spatial autocorrelation among both attributes
and errors only exacerbates the problem (Griffith
et al. 1999). Clearly, there is need to develop new ap- 
proaches to spatiotemporal model validation. Part of 
this effort will require educating both producer and 
user communities about how to grapple intelligently 
with data and model uncertainty (Mowrer 1999). 
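
To see why Boolean overlays propagate error in a complex way, consider the simplest possible case: two binary layers combined with a logical AND. The sketch below enumerates every combination of true state and classification error, assuming the layers and their errors are mutually independent. That assumption is deliberately naive; it is exactly what spatial autocorrelation violates, so the result is a baseline only. The function name, prevalences, and error rates are hypothetical.

```python
from itertools import product

def and_overlay_accuracy(prev_a, prev_b, err_a, err_b):
    """Probability that (A AND B) from error-prone layers matches the truth.

    prev_a, prev_b: true prevalence of each binary attribute.
    err_a, err_b:   per-cell probability that each layer is misclassified.
    Layers and errors are assumed to be mutually independent.
    """
    accuracy = 0.0
    for true_a, true_b, flip_a, flip_b in product([0, 1], repeat=4):
        p = ((prev_a if true_a else 1 - prev_a) *
             (prev_b if true_b else 1 - prev_b) *
             (err_a if flip_a else 1 - err_a) *
             (err_b if flip_b else 1 - err_b))
        observed = (true_a ^ flip_a) & (true_b ^ flip_b)
        if observed == (true_a & true_b):
            accuracy += p
    return accuracy

overall = and_overlay_accuracy(0.5, 0.5, 0.1, 0.1)
habitat_only = and_overlay_accuracy(1.0, 1.0, 0.1, 0.1)
print(f"overall overlay accuracy:         {overall:.3f}")
print(f"accuracy where habitat is present: {habitat_only:.3f}")
```

Two layers that are each 90 percent accurate yield an overlay that looks about 90 percent accurate overall, yet only 81 percent accurate in the cells that truly are habitat. Overlay accuracy therefore depends on the prevalence of the attributes as well as on the error rates.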


Data Representation 


The tools and techniques commonly used to represent 
and manipulate geospatial data carry strong assump- 
tions as to what constitutes the units of analysis. The 
phrase “units of analysis” refers not to units of specific
measurement systems but rather to the conceptual
entities that are subject to measurement and
analysis. Geospatial entities include axiomatic geomet- 
ric objects (e.g., points, lines, polygons, polyhedra) 
that are located within a spatial reference system as 
well as synthetic geometric objects derived from sen- 
sor systems, such as the spatiotemporal trajectory rep- 
resenting telemetered animal movements or an array 
of pixels portraying hyperspectral radiance upwelling 
from a landscape. Fisher (1997) cautioned that the pixel is
“a snare and a delusion” because it may represent an
unobservable admixture of geospatial entities. Thus,
the pixel does not constitute a “proper” geographic
object and its ill-defined status often hinders analysis.
This warning applies both to imagery per se and to its
representation and manipulation within GIS (Cracknell
1998). An issue related to the units problem is the
now-tired debate in GIS circles over the relative merits
of vector versus raster representations and discrete
objects versus continuous fields. Couclelis (1992) offered
a clever conceit in her title “People manipulate objects
(but cultivate fields)” and argued that human cognition
relies on both modes of spatial representation. It
is prudent, therefore, to embrace the richer realm of 
multiple representational modes. This multimodality 
does then pose a challenge to the data models actually 
implemented in GIS. 

Burrough and Frank (1995), inquiring about the 
generality of GIS implementations, observed an unre- 
solved and possibly irresolvable tension between the 
universal data models that computer scientists seek 
and the ad hoc data models that GIS practitioners use
to address specific problems. They further identified
three major groups of GIS users: managers of defined 
objects (e.g., cadastres, utilities, facilities manage- 
ment); planners and resource managers (e.g., multiat- 
tribute evaluation and decision making); and space- 
time modelers (e.g., environmental scientists broadly 
construed). What Burrough and Frank (1995) discov- 
ered was a profound conceptual disconnect in the GIS 
community between the units of analysis and the base- 
line models employed by different disciplinary sub- 
groups. Current GIS implementations are not generic 
and they do not adequately support space-time model- 
ing (Burrough and Frank 1995; Couclelis 1999). 


Representing Time 


Inclusion of time in GIS is not as straightforward an 
exercise as might be expected. A major source of diffi- 
culty stems from how the increased dimensionality of 
the data affects what can be assumed about the data. 
Consider an unordered list, the simplest database 
structure. It is a collection of zero-dimensional data, 
database records that lack spatial or temporal rela- 
tionships with other records. Although this structure 
is easy to implement and enables efficient querying 
about the relationships between records, it can permit 
inferences about relationships that are nonsensical 
when viewed within the broader context of the data. 
Classical statistics is built around a central assumption 
of the zero-dimensionality of data and it is common 
knowledge that statistically significant but biologically 
irrelevant relationships can be obtained through the 
injudicious use of correlation analysis. 

Temporal databases introduce an explicit, unidirec- 
tional, one-dimensional structure to the data. The 
“arrow of time” makes temporally oriented queries 
and logical inferences possible (Snodgrass 1992; 
Chomicki and Toman 1998). Time-series analysis is 
the statistical analogy. Spatial databases represent spa- 
tial relationships as locations (raster/fields) and/or as 
entities (vector/objects). Although coordinate systems 
supply topology, there is no a priori ordering of the di- 
rectionality of causation in space as there is in time. 
This has the important consequence of requiring the 
user to inform the database about the flows of influ- 
ence among spatially ordered data. The user must 
specify a model of spatial relationships in order to
make meaningful queries. For example, many GIS
have a module that introduces the influence of gravity 
into the database topology in order to analyze 
drainage patterns. Although topological relationships 
indicate who is the neighbor of whom, additional in- 
formation is required to know who are the effective 
neighbors. Different processes can have different effec- 
tive neighborhoods or corridors at different scales. 
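
The difference between topological and effective neighbors can be sketched with a toy elevation grid. Every interior cell has eight adjacent cells, but once gravity is introduced only the steepest downslope cell receives its runoff. The grid values and function names below are hypothetical, and the single-direction steepest-descent rule is a common simplification, not the method of any particular GIS module.

```python
import math

# Hypothetical 3 x 3 elevation grid (meters), unit cell spacing.
dem = [
    [9.0, 8.0, 7.0],
    [8.0, 6.0, 5.0],
    [7.0, 5.0, 3.0],
]

def adjacent_cells(dem, row, col):
    """Topological neighbors: all cells that touch the focal cell."""
    cells = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            if (dr or dc) and 0 <= r < len(dem) and 0 <= c < len(dem[0]):
                cells.append((r, c))
    return cells

def downslope_neighbor(dem, row, col):
    """Effective neighbor for drainage: the adjacent cell with the steepest drop.

    Returns None when no adjacent cell is lower (a pit or a flat).
    """
    best, best_slope = None, 0.0
    for r, c in adjacent_cells(dem, row, col):
        distance = math.hypot(r - row, c - col)
        slope = (dem[row][col] - dem[r][c]) / distance
        if slope > best_slope:
            best, best_slope = (r, c), slope
    return best

print(len(adjacent_cells(dem, 1, 1)), "topological neighbors of the center cell")
print("water from the center cell drains to", downslope_neighbor(dem, 1, 1))
```

A different process, such as seed dispersal by wind, would call for a different rule and so a different effective neighborhood on the same grid.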

The addition of time in a spatial model further 
complicates the issue of influence and places more re- 
sponsibility on the user to identify relevant neigh- 
borhoods and supply meaningful ordering. Neighbor- 
hoods can be discontinuous in space and time due to 
lagged effects such as the spatiotemporal dispersal in 
seedbank dynamics and wildfires. 


Spatiotemporal Data in GIS 


According to Peuquet (1999), there are currently three 
nonexclusive modes of spatiotemporal data represen- 
tation of discrete events in GIS: location-based, entity- 
based, and time-based. Location-based representation 
uses sequential map layers with each map layer being 
a “snapshot” of the current state of spatial relation- 
ships. Aside from the potential problem of data vol- 
ume, this approach has the limitation that changes are 
not explicitly represented but instead must be calcu- 
lated from the data volume. Further, the timings of 
changes are not specified other than noting that they 
occur sometime between successive observations. This 
temporal imprecision creates problems for data analy- 
sis if the time between observations is significantly 
longer than the tempo of the activity of interest (Rat- 
cliffe and McCullagh 1998). A partial solution is to 
specify change events using a variable-length list for 
each spatial object, but the problem of calculating du- 
rations persists (Peuquet 1999). 
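
The limitation of the snapshot approach is easy to demonstrate. In the sketch below, two hypothetical land-cover layers are differenced to recover changes. The class labels, years, and function name are invented for illustration. Each change must be computed by comparing layers, and its timing can only be bracketed by the two observation dates.

```python
# Two hypothetical land-cover snapshots of the same 2 x 2 grid, keyed by year.
snapshots = {
    1990: [["grass", "grass"],
           ["pond", "crop"]],
    1995: [["grass", "crop"],
           ["grass", "crop"]],
}

def changes_between(snapshots, year_a, year_b):
    """List cells whose class differs between two snapshots.

    Each change is bracketed by the two observation dates because the
    snapshots carry no information about when the change occurred.
    """
    earlier, later = snapshots[year_a], snapshots[year_b]
    changes = []
    for r, row in enumerate(earlier):
        for c, old in enumerate(row):
            new = later[r][c]
            if new != old:
                changes.append({"cell": (r, c), "from": old, "to": new,
                                "occurred_between": (year_a, year_b)})
    return changes

changes = changes_between(snapshots, 1990, 1995)
for change in changes:
    print(change)
```

A pond that filled and dried several times between the two dates would leave no trace in this record, which is the temporal imprecision noted above.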

An entity-based representation can treat geographic 
objects (e.g., points, lines, polygons) as variables in 
time using “amended” topological vectors (Langran 
1992, 1993) or using an object-oriented approach 
(Raper and Livingstone 1996). With the former ap- 
proach, the spatiotemporal topology may quickly get 
unwieldy (Peuquet 1999). With the latter approach, 
there is the enduring problem of identifying what con- 
stitutes well-defined and well-behaved entities with all 
the attendant subtleties of scale-dependent behaviors 
(King et al. 1991; Johnson 1996) and representation
of discontinuous “fields” (Couclelis 1992). 

Temporal vectors record “events” in time-based 
representations generating an explicit temporal topol- 
ogy that acts as an adjunct to location or entity repre- 
sentations (Peuquet and Duan 1995; Peuquet 1999). 
There are three kinds of temporal relationships: (1) or- 
dinal and interval operations along a single temporal 
distribution of events; (2) Boolean set operations be- 
tween different event distributions; and (3) temporal 
rescaling, including extrapolation and generalization 
(Peuquet 1999). A principal difficulty of this approach 
is an implicit assumption of “omniscience” associated
with the temporal topology. Imprecision in event dat- 
ing can lead to the loss of event duration as a tempo- 
ral metric for the event. In this case, an ordinal model 
can be used, though it is a less-powerful descriptor. In 
some cases, even temporal ordering may be unknown 
or unobservable. Fuzziness, or uncertainty about event 
initiation, duration, and termination, generates a 
fuzzy temporal topology that limits inferential power. 
Couclelis (1999) argued for consideration of relative 
spatiotemporal structures and influence of culture on 
cognition and event representation. 
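
The first two kinds of temporal relationship can be sketched with events stored as start and end times. The event lists and function names below are hypothetical, and the sketch assumes that start and end dates are known exactly. That is the assumption of omniscience discussed above.

```python
def precedes(a, b):
    """Ordinal relation: event a ends before event b starts."""
    return a[1] <= b[0]

def overlap(a, b):
    """Interval shared by events a and b, or None if they do not overlap."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None

def intersect(events_a, events_b):
    """Boolean AND between two event distributions."""
    shared = (overlap(a, b) for a in events_a for b in events_b)
    return sorted(s for s in shared if s is not None)

# Hypothetical events as (start_day, end_day) in day of year.
ponding = [(120, 135), (160, 170)]
chorusing = [(130, 140), (165, 168)]

both = intersect(ponding, chorusing)
durations = [end - start for start, end in both]
print("ponding and chorusing coincide during:", both)
print("durations in days:", durations)
```

If only the order of events were known, `precedes` could still be evaluated from ranks, but `overlap` and the durations could not. This is the sense in which an ordinal model is the less powerful descriptor.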

Distinct from the discrete event approach, spa- 
tiotemporal data models for continuous change have 
been proposed (Erwig et al. 1999; Chomicki and 
Revesz 1999b). The model of Erwig and colleagues 
(1999) relies on developing novel abstract data types 
for spatiotemporal objects, such as moving points and 
moving regions. The model of Chomicki and Revesz 
(1999b) is based on parameterized polygons in which 
the vertices are defined using linear functions of time. 
Although this approach can model several types of 
continuous change, including movement, growing, 
and shrinking, it has other limitations and no current 
implementations. 
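
The idea of vertices defined as linear functions of time can be illustrated with a few lines of code. This is a sketch of the general idea only, not the formalism of Chomicki and Revesz. The representation of a vertex as two (initial position, velocity) pairs and the shrinking square pond are assumptions made for the example.

```python
def vertex_at(vertex, t):
    """Evaluate a vertex whose coordinates are linear functions of time."""
    (x0, vx), (y0, vy) = vertex
    return (x0 + vx * t, y0 + vy * t)

def polygon_at(vertices, t):
    """Snapshot of the parameterized polygon at time t."""
    return [vertex_at(v, t) for v in vertices]

def area(points):
    """Polygon area by the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical pond: a 10 x 10 square whose corners each move inward
# by one unit per day along both axes.
pond = [
    ((0, 1), (0, 1)),
    ((10, -1), (0, 1)),
    ((10, -1), (10, -1)),
    ((0, 1), (10, -1)),
]

for day in (0, 2, 5):
    print(f"day {day}: area = {area(polygon_at(pond, day)):.1f}")
```

Because the shape is defined at every instant, the time at which the pond vanishes can be computed directly instead of being bracketed between snapshots. Beyond day 5 the linear functions carry the vertices past one another, so a real implementation would need a validity interval for each polygon.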

Many of the difficulties of incorporating time into 
geospatial databases stem from the drive to forge uni- 
versal data models for GIS implementations. Several 
definitions for spatiotemporal data types have been 
proposed (Worboys 1994; Peuquet and Duan 1995; 
Erwig et al. 1998; Chomicki and Revesz 1999a; Erwig
et al. 1999). Burrough and Frank (1995) recognized 
this tension between theory and practice and called for 
a plurality of approaches rather than a hobbled but 
generic model. However, there is a significant lag time
between academic technical innovations and their im- 
plementation in commercial GIS. Ultimately, there is a 
need to wed GIS with simulation models: “The essential
task is to extend a world-history model, consisting
of observed events, objects and locations, to a process
model that also includes interpretive occurrence rules
and patterns expressed as combinations of relationships”
(Peuquet 1994, p. 457). GIS then becomes a
mechanism to inform simulations by serving geospa- 
tial data and then to gather, organize, and visualize the 
results. Although this approach downplays the generic 
querying functionality for the resulting spatiotemporal 
database, it emphasizes the problem-solving and fo- 
cused-decision-support aspects of modeling within a 
resource-management context. Progress has been 
made in this direction: witness the recent proceedings 
volumes on the topic of GIS and environmental mod- 
eling (Goodchild et al. 1993, 1996). 


Modeling Geospatial Data through Time 


The ecological literature is replete with approaches to 
predict species occurrences. We shall only touch on 
general points relevant to GIS. A common approach 
to spatiotemporal modeling is empirical and relies on 
opportunistic temporal sampling. Relationships are in- 
ferred from one or more slices in time, generalized, 
and then interpolated (Yang et al. 1998a) or extrapo- 
lated, as in Markov transition matrices (Hall et al. 
1991; Pastor et al. 1993; Hobbs 1994). Opportunistic 
sampling can suffer from significant, but largely hid- 
den, observational biases, as shown by research with 
such spatiotemporal anthropocentric processes as 
malaria (Schellenberg et al. 1998) and crime (Ratcliffe 
and McCullagh 1998). 
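
The Markov approach can be sketched as follows: transition probabilities are estimated from the class of each cell at two dates and then applied to project the landscape one step forward. The cell lists and function names are hypothetical. The projection assumes that the transition probabilities stay constant through time, which is the strong assumption on which such extrapolation rests.

```python
from collections import Counter

def transition_matrix(before, after, classes):
    """Estimate class-to-class transition probabilities from two snapshots.

    A class absent from the first snapshot gets a row of zeros.
    """
    counts = Counter(zip(before, after))
    matrix = {}
    for src in classes:
        row_total = sum(counts[(src, dst)] for dst in classes)
        matrix[src] = {dst: (counts[(src, dst)] / row_total if row_total else 0.0)
                       for dst in classes}
    return matrix

def project(shares, matrix):
    """Apply the transition matrix once to a vector of class shares."""
    return {dst: sum(shares[src] * matrix[src][dst] for src in matrix)
            for dst in matrix}

# Hypothetical landscape of ten cells observed at two dates.
classes = ["grass", "crop"]
before = ["grass"] * 6 + ["crop"] * 4
after = ["grass"] * 4 + ["crop"] * 2 + ["crop"] * 4

matrix = transition_matrix(before, after, classes)
shares_now = {c: after.count(c) / len(after) for c in classes}
shares_next = project(shares_now, matrix)
print("transition probabilities:", matrix)
print("projected class shares:  ", shares_next)
```

Two snapshots fix the matrix completely, so any bias in when those snapshots happened to be taken is carried into every projected step.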

A more robust approach is to represent explicitly 
significant ecological processes and their interactions: 
resource availability and capture; organismal repro- 
duction, dispersal, and demise; and habitat hetero- 
geneity and climatic variability. The modeling may 
emphasize the temporal dynamics of resource avail- 
ability within the habitat (e.g., Weiss and Weiss 1998), 
characterize population-level behaviors (Conroy et al. 
1995; Turner et al. 1995; Radeloff et al. 1999), or 
characterize populations of individual-level behaviors
(Shugart et al. 1992; Johnston et al. 1996; DeAngelis
et al. 1998; Gross and DeAngelis, Chapter 40). 

An example of the prospects for integrated GIS- 
simulation modeling is the Across Trophic Level Sys- 
tem Simulation (ATLSS) that aims to predict responses 
of several higher trophic level species to various 
change scenarios in Everglades hydrology (DeAngelis 
et al. 1998; Gross and DeAngelis, Chapter 40). The 
modeling approach uses hydrologic models to drive 
habitat and resource availability models through 
changes in water depth. Individual-based species mod- 
els populate and move through the South Florida 
landscape that is articulated at the spatial resolution 
of Landsat Thematic Mapper imagery (30 meters). 
This suite of models interacts hierarchically. The spa- 
tiotemporal topology of influence is informed by hy- 
droperiod, which, in conjunction with topography, 
largely drives habitat suitability for various species 
(Gross and DeAngelis, Chapter 40). 


Practical Limits in Using Current 
Geospatial Data to Predict 
Bufo cognatus Occurrence 


Terrestrial habitats typically lack the a priori structur- 
ing of influence offered by flowing waters. Habitat 
modeling for anuran amphibians is a challenge be- 
cause of a complex life cycle (Wilbur 1980; Alford 
1999) that requires two contrasting habitats: an 
ephemeral aquatic setting for the larvae and a terres- 
trial setting for the adults. To illustrate some of the 
difficulties in predicting geospatial occurrences of 
species in this taxon, we have chosen Bufo cognatus, 
the Great Plains toad, a grasslands toad found not 
only across the Great Plains but also in moister areas 
of the arid southwestern United States and northern 
Mexico. Nocturnal in habit, it can be diurnally active 
in the breeding season and can burrow underground 
and survive several weeks to avoid dry weather (Bragg 
1940). B. cognatus overwinters in shallow terrestrial 
burrows but is not tolerant of freezing (Storey and 
Storey 1986). During the brief breeding season, 
males congregate only around shallow ephemeral 
pools, such as in bison wallows and ponding areas in 
fields, shortly after intense rains and join in chorus to attract
females (Bragg 1940). The ensuing scramble
competition for females is characteristic of explosive-breeding
anurans (Wells 1977).

In an insightful paper, Toft (1985) reviewed the cur- 
rent herpetological literature to compare the resource 
partitioning in habitats across a wide variety of am- 
phibians and reptiles. Her resource categories fol- 
lowed those of Schoener (1974): macrohabitat, micro- 
habitat, food type, food size, diel time, and seasonal 
time. She discovered that larval amphibians differ sig- 
nificantly from other organisms in their patterns of re- 
source partitioning. Although habitat characteristics 
are key to determining habitat use for most amphib- 
ians and reptiles, seasonal time was the more critical 
resource for larval amphibians: the duration of time 
available in an ephemeral pond is limited on one end 
by the onset of the breeding season and on the other 
end by increased risk of predation and the pond dry- 
ing out (Toft 1985). Wilbur (1980) contends that com- 
plex life cycles are adaptations to exploit transient op- 
portunities for growth or dispersal. The anuran 
solution has the advantage of “being able to exploit 
the rich, but highly transient, aquatic environments 
that are seasonally available at all latitudes” (Toft
1985, p. 11). A recent review by Alford (1999) 
stresses the significance of environmental uncertainty 
on larval growth, development, behavior, and intra- 
and interspecific interactions. 

Anuran habitat modeling is complicated by the 
fact that what constitutes a resource for the adults— 
breeding pools—is habitat for the larvae. Given the 
vagaries of the continental climate of the Great 
Plains, several years can pass before B. cognatus has 
a robust recruitment year (Bragg 1940). Data from 
the Nebraska panhandle (Nebraska climate division 
1) typify the high interannual variability in spring 
precipitation these organisms encounter (Fig. 23.1). 
An approach to species occurrences that relies prima- 
rily on the use of a land cover-vegetation alliance 
class as the primary habitat indicator (e.g., Scott et 
al. 1993) will tend to overpredict habitat extent for 
B. cognatus because it focuses only on adult habitat. 
A principal constraint in the species population dy- 
namics is the availability of larval habitat to persist 
long enough to enable metamorphosis (Wilbur 1980; 
Toft 1985; Alford 1999). Surveying across genera, 
Brodman (1998) concluded that physical geography
was a stronger predictor of amphibian distributions
in the U.S. Midwest than either precipitation or vegetation
community. Corn and Peterson (1996) argued
that changes in disturbance regimes across the
Great Plains, in particular decreased fire frequency
and increased extent of grazing lands, may decrease
habitat for many amphibians and that the presence
of specific habitat is a better indicator of occurrence
than vegetation class surrogates.

Figure 23.1. Total May precipitation in Nebraska climate division 1: 1895-1998.

From a geospatial perspective, the identification of 
shallow, ephemeral pools is currently a challenge. 
Such topographic features are too fine to be detected 
using extant 7.5-minute digital elevation models. The 
small spatial extent and transitory nature of the pools 
make them elusive to most space-borne sensors, which 
have a coarse spatial resolution and long return inter- 
val relative to the phenomenon of interest. For exam- 
ple, Landsat 7 (http://landsat7.usgs.gov/), which was 
launched in April 1999, carries the ETM+ sensor that 
provides 15-meter (panchromatic) and 30-meter (mul- 
tispectral) spatial resolution at a return interval of six- 
teen days (Goward and Williams 1997). Several fine 
spatial resolution optical sensors with short return intervals
are planned for launch over the next few years (Alpin
et al. 1997). For example, a commercial satellite 
launched at the end of 1999, Ikonos (http://www. 
spaceimaging.com/), boasts impressive spatial resolu- 
tions for both panchromatic (1-meter) and multispec- 
tral (4-meter) imagery. 
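
The mismatch between return interval and pool persistence can be quantified roughly. The sketch below estimates the chance that a pool is imaged at least once, using the sixteen-day return interval cited above. The pool duration and the cloud-free probability are hypothetical. The calculation assumes that pool formation is unrelated to the overpass schedule and that cloud cover is independent between overpasses, and it ignores spatial resolution altogether.

```python
def prob_imaged(duration_days, revisit_days, p_clear=1.0):
    """Probability that a pool is imaged at least once under clear sky.

    The pool is assumed to form at a uniformly random point in the revisit
    cycle, so it spans either `whole` or `whole + 1` overpasses.
    """
    if revisit_days <= 0:
        raise ValueError("revisit_days must be positive")
    whole, remainder = divmod(duration_days, revisit_days)
    p_extra = remainder / revisit_days          # chance of one more overpass
    miss_whole = (1.0 - p_clear) ** whole       # all `whole` passes cloudy
    miss_extra = (1.0 - p_clear) ** (whole + 1)
    return p_extra * (1.0 - miss_extra) + (1.0 - p_extra) * (1.0 - miss_whole)

# Hypothetical pool persisting 8 days, sensor returning every 16 days.
print(f"always clear:    {prob_imaged(8, 16):.2f}")
print(f"clear half time: {prob_imaged(8, 16, p_clear=0.5):.2f}")
```

Under these assumptions, a pool that lasts half the return interval is missed half the time even under perpetually clear skies, and far more often after the intense rains that create it.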

Obscuring cloud cover is always a concern with 
optical imagery. Imaging radar may prove helpful: 
microwaves can penetrate cloud cover and backscattering
is sensitive to surface roughness. In the absence
of significant wind-driven waves, small bodies of 
water scatter back few of the illuminating microwaves 
due to specular reflection off the relatively smooth 
water surface. Low backscattering is not, however, a 
unique signature of water: concrete pads, empty park- 
ing lots, and radar shadows can masquerade as ponds 
or ephemeral pools. Moreover, there is a trade-off be- 
tween the spatial resolution of the image and the in- 
tensity of speckle, the high spatial frequency salt-and- 
pepper noise that is inherent to imaging radar. 
Principal component analysis of image time series can 
greatly improve the signal-to-noise ratio by effectively 
trading time for space (Henebry 1997). Nevertheless, 
the finest spatial resolution currently available from 
space-borne imaging radar is 8-9 meters from the fine
beam of RADARSAT, and its return interval remains a 
limitation. Combining multiple sensors may prove a 
useful, if costly, strategy to monitor such a variable 
spatiotemporal resource (for adults) or habitat (for 
larvae). 

A secondary constraint on B. cognatus distribution 
may relate to the spatial distribution of soil characteristics.
The adults spend much of their life underground
in burrows, very shallow during the breeding season
and deeper during dry spells and during the winter
(Bragg 1940). Overwintering adults must burrow 
below the soil frostline to survive. Depth to soil frost- 
line is variable with soil texture, moisture, thermal 
conductivity, and intensity and duration of subzero air 
temperatures. The interaction of the spatial distribu- 
tions of winter temperature minima, their duration 
and interannual variability on the one hand, and soil 
textural characteristics on the other, may restrict oc- 
currence of B. cognatus within the northern reaches of 
its broad latitudinal range, which extends from the 
southern ends of the Canadian prairie provinces into 
Mexico (Conant and Collins 1991). High water tables 
may exclude B. cognatus from some areas (Jones et al. 
1981) and recently metamorphosed individuals may 
seek to burrow in the softer earth of cultivated fields 
rather than in undisturbed, harder surfaces (Smith and 
Bragg 1949). As the USDA Natural Resources Conservation
Service completes the county-level Soil Survey
Geographic database (SSURGO) over the next several
years, an important gap in geospatial data will be
filled and modelers will have access to detailed, field- 
verified soils inventories at finer spatial resolutions 
than are currently available from the state-level 
STATSGO database (State Soil Geographic). For in- 
stance, ponding duration is a significant temporal 
habitat variable listed as available within STATSGO 
but which in actuality is blank for the Nebraska 
dataset. (Moreover, the categorical and spatial resolu- 
tion of the STATSGO data make it ill suited to this 
particular task.) 

A third constraint on B. cognatus distribution lies 
not in specific environmental limits but rather in obser- 
vational error and bias. A fossorial, nocturnal amphib- 
ian, the Great Plains toad is simply difficult to en- 
counter. Bragg (1940) notes that the burrows are 
difficult to locate once the toad is below the surface: “I 
have found but few during several years of observation”
(p. 332). In nine herpetofaunal surveys in Iowa,
Kansas, and Nebraska since 1979, success in encountering
B. cognatus has been mixed. Ballinger and coworkers
(1979) did encounter B. cognatus during surveys
from 1975 to 1978 but concluded it was the least- 
abundant toad in western Nebraska. Jones and 
coworkers (1981) failed to find B. cognatus but attrib- 
uted this to the high water table at the Nebraska site. 
Lynch (1985) noted that B. cognatus were observed 
only after heavy rains during a multiple-year statewide 
survey of Nebraska. Christiansen (1981) and Chris- 
tiansen and Mabry (1985) concluded that B. cognatus 
and other toads in Iowa had maintained their ranges 
observed in a 1940 statewide survey. Two more recent 
studies in Iowa (Lannoo et al. 1994; Hemesath 1998) 
suggest that B. cognatus may even be expanding its 
range eastward; however, a more parsimonious expla- 
nation is that B. cognatus was merely overlooked in 
earlier surveys. Heinrich and Kaufman (1985) failed to 
observe the Great Plains toad at a tallgrass prairie site 
in the northern Flint Hills of Kansas but surmised that 
it was likely to occur there. Surveying a different tall- 
grass prairie site in the northern Flint Hills, Busby and 
Parmelee (1996) did not find B. cognatus in 1993. They 
noted that it was also not found in surveys from the 
1930s but had been observed in the same area prior to 
1930. It is possible that the series of droughts during 
the 1930s caused local extinction of B. cognatus. 


Blaustein et al. (1994), commenting on the per- 
ceived global decline in amphibian populations, ar- 
gued that amphibian metapopulation dynamics are 
poorly understood and that recolonization following 
local extinction events may be hindered by human 
land-use patterns. Although one-third of these surveys 
failed to encounter B. cognatus, central Oklahoma has 
hosted a robust, well-studied metapopulation for 
more than five decades (Bragg 1940; Krupka 1989). 
Of the 173 voucher specimens of B. cognatus in the 
Nebraska State Museum collected since 1969, 115 
were sampled during the 1970s, forty-four during the 
1980s, and only fifteen during the 1990s. This appar- 
ent decline may be largely attributable to reduced 
sampling effort: six of the specimens were collected 
“DOR” (dead on road) and five of those six were collected
on the same day at one location. The observational
record of B. cognatus can be said to exhibit spatiotemporal
patchiness, which complicates both model
development and validation efforts. 


Conclusion 


Modeling species occurrences successfully depends ul- 
timately upon blending the spatial perspectives of bio- 
geography and landscape ecology with the temporal 
emphases of population and community ecology. 
Thus, there is a deep need to link geospatial data with 
temporal data in spatiotemporal models of metapopu- 
lation dynamics across changing landscapes. There are 
currently several interrelated limitations to meeting
this goal:


Current Limits 


1. Current implementations of GIS are not designed 
for spatiotemporal modeling. 

2. Rescaling of data and relationships is a subtle art 
because it is largely context-specific. 

3. GIS topologies are devoid of a priori causal struc- 
tures. It is thus critical that domain experts inform 
these topologies, but this fundamental need is often 
not recognized as such. 

4. Methods to validate models persuasively remain 
an open question because theory about spatiotem- 
poral processes and patterns, whether from an 
ecological or statistical perspective, is sparse or
immature. 


Future Prospects 


The prospects for integrating time with geospatial 
data in models of species occurrence are good given 
developments on multiple fronts. 


GIS Diversity 


As more GIS vendors enter the market over the next 
several years, there will be opportunities to explore 
new approaches. Object-oriented GIS holds some 
promise, but there are questions about the robustness 
of dynamical representations (Raper and Livingstone 
1996). As this new technology is largely underex- 
plored, it may require different techniques for valida- 
tion and error analysis. 


Increasing Resolution and Volume of 
Remote Sensing Data Sets 


Earth resources satellites are being designed with in- 
creasingly higher spatial, spectral, radiometric, and 
temporal resolutions. Image sizes are often no longer 
expressed in megabytes but in gigabytes. In December 
1999, NASA began implementing the Earth Observing 
System (EOS) with the launch of the Terra satellite 
(http://terra.nasa.gov/), the flagship of the EOS con- 
stellation. The data produced by the EOS sensors will 
be measured in terabytes/day. The Moderate Resolu- 
tion Imaging Spectroradiometer (MODIS) on Terra, 
for example, provides coverage of the earth in thirty- 
six spectral bands at 250 meters to 1 kilometer in res- 
olution (Justice et al. 1998). Together with Landsat 7, 
Ikonos, and a host of future public and private platforms,
including Europe's powerful ENVISAT
(http://envisat.esa.int), a veritable flood of observational
data will rise during the first decade of the
twenty-first century. It is critical that we be able to ac- 
cess, archive, analyze, and share very large data sets 
and that we be able to integrate EOS data with other 
remote sensing imagery and with ancillary geospatial 
data (e.g., digital elevation data, climate data, soils 
data) via GIS. The volume of data requires a paradigm 
shift in analytical strategy: from occasional observa- 
tion to monitoring, from statics to dynamics. 
Multitemporal and Integrated
Multisensor Analyses 


Characterization and mapping of vegetation typically 
requires at least two or three images acquired at 
critical times during the growing season to capture 
essential components of phenology and/or land- 
management practices (Loveland et al. 1995; Wolter et 
al. 1995; Congalton et al. 1998). Moreover, there is 
increasing interest in the remote-sensing community to 
assess both intra-annual (seasonal) and inter-annual 
changes in land cover (Maisongrande et al. 1995; 
Reed and Yang 1997; Egbert et al. 1998; Yang et al. 
1998b). Such work can involve acquisition, archiving, 
and processing of dozens to hundreds of images along 
with supporting ancillary geospatial data. In addition, 
it is now well established that there are significant ad- 
vantages to using data acquired by sensors operating 
in distinctly different regions of the electromagnetic 
spectrum. For example, data from microwave sensors 
provide structural information that complements re- 
flectance data gathered from visible and infrared sys- 
tems (Imhoff et al. 1997; Chauhan 1997). 

As yet more orbital platforms are sent aloft carrying
both active and passive sensors with varying combinations
of spectral-spatial-temporal resolutions and
off-nadir view angles, there will be more opportuni- 
ties to elicit the spatiotemporal dynamics of habitat at 
synoptic extents. Similarly, the broader use of the 
global positioning system (GPS) for tracking individ- 
ual organisms and populations will generate richer 
databases from which to study animal movement pat- 
terns across the inhabited landscape. Spatiotemporal 
modeling demands large volumes of data for model 
development, calibration, validation, and verification. 
Data and theory move hand in hand. As experience 
with collecting, handling, and analyzing spatiotempo- 
ral data accumulates, we expect to see new concepts 
and theories emerge. Modeling species occurrence 
must be an iterative process as long as the landscape 
changes. 
Models of the association between species and
their environment and the predictive maps 
based on these associations have been used for a va- 
riety of purposes including population estimates and 
conservation planning (e.g., Corsi et al. 1999; Jarvis 
and Robertson 1999). They provide a means of ap- 
plying existing data in a regional context in the many 
situations where further field surveys are constrained 
by time or resources (Nicholls 1989). Such models 
can be tested, and the decision about whether any 
formal testing is required and about the most appro- 
priate test should be influenced by whether the theo- 
retical content of the model or its operational capa- 
bility is more important (see Van Horne, Chapter 4; 
Rykiel 1996). In the context of land-use planning for 
conservation, validation is important and should be 
designed to test whether the model, within its do- 
main of applicability, possesses sufficient accuracy to 
be useful for its intended application (Chatfield 
1995; Rykiel 1996). 

In this chapter, we present a set of models pro- 
duced from existing government data for eight plant 
species in the Central Highlands of Victoria, Aus- 
tralia. Several of the modeling protocols were devel- 
oped for species models being used by the govern- 
ment in land-use planning, and part of the intention 
of the study was to investigate success under these 
protocols. The models were developed through four
different modeling methods that utilize either presence-only
or presence-absence data. The aim of this
chapter is to investigate the accuracy of the models 
at different extents (see Morrison and Hall, Chapter 
2) and to assess the performance of different valida- 
tion strategies. 


Study Area 


Land use in Australian forests is currently negotiated 
through Regional Forest Agreements (RFAs) in re- 
gions defined by government. Because we were inter- 
ested in assessing the adequacy of models that could 
be created for an RFA, modeling was based on data 
from one of these regions. The Central Highlands 
RFA area is a 1.1-million-hectare region northeast of 
Melbourne, Victoria (see Fig. 24.1 in color section). 
The landscape is dominated by the deeply dissected 
landforms of the Great Dividing Range but extends 
to a flat to undulating coastal plain in the south. The 
area ranges in elevation from 25 to 1,550 meters, in 
annual mean temperature from 5.5°C to 14.6°C, and 
in annual precipitation from 570 to 1,800 millime- 
ters. Floristic communities are diverse and form mo- 
saics throughout the region (Foreman and Walsh 
1993). Fifty-six percent of the area is public land, 
largely covered by native forest, and the remaining 
private land is mostly cleared and used for agricul- 
ture (RFA Steering Committee 1996). 
TABLE 24.1.


Details of eight of the modeled plant species.


                                                             Records in  Records in  Records in   Records in
                                                   Range     modeling    modeling    independent  independent
                                                   (within   quadrats^c  subset^c    validation   validation
Species name                Status^a   Life form   RFA)^b                            set^c        subset^c
Grevillea barklyana         Rr         shrub       l         15/3507     15/444      8/383        8/100
Oxalis magellanica          r          forb        m         37/3485     28/381      12/379       12/66
Tetratheca stenocarpa       Rr         low shrub   m         54/3468     50/412      17/374       10/101
Wittsteinia vacciniacea     r          low shrub   m         131/3391    129/280     57/334       54/24
Helichrysum scorpioides                forb        (w)       204/3318    195/1459    26/365       T721
Leptospermum grandifolium              tall shrub  w         179/3343    179/1661    58/333       56/105
Nothofagus cunninghamii                tree        w         509/2843    509/1312    99/292       19/73
Phebalium bilobum                      shrub       (w)       119/3403    119/673     63/328       58/35


^a Conservation status: species that are rare (r); capitals indicate national status and lowercase indicates state classification.
^b Widespread (w), medium range (m), and localized (l); brackets indicate intermediate classes.
^c Expressed as number present/number absent.


Species Data 


Government flora records for the region are stored dig- 
itally in the Flora Information System (FIS) of the De- 
partment for Natural Resources and Environment 
(NRE). From the full set available in September 1996, 
all quadrat records that were assessed as accurately lo- 
cated (Fiona Cross, pers. comm.) and that provided 
presence-absence data for all vascular species were ex- 
tracted. These records were the result of full quadrat 
searches commissioned since 1976. The searches were 
often purpose-driven, being part of prelogging surveys, 
regionwide studies, and targeted sampling of particular 
habitats, such as rainforests and heathlands. The most 
common sampling protocols utilized quadrats of 900 
square meters (30 x 30 meters, or 15 x 60 meters in ri- 
parian vegetation) and searching continued in a se- 
quence of dissimilar microhabitats until no new species 
had been sighted for ten minutes. The final set of 3,522 
quadrats was scattered irregularly over the region (Fig. 
24.1), with site location biased toward roadsides and 
away from ecotones. Sites tended to be clustered in 
high rainfall areas closer to Melbourne and in certain 
vegetation types (e.g., in cool temperate rainforest) 
(Elith et al. 1998). Approximately half of the quadrats 
were surveyed before 1986. The biases within this data 
set are similar to those in other sets of data compiled in 
an ad hoc way. The most likely implications for modeling
are that some unique environmental combinations
will not have been sampled and models will not fit well
in such regions. In addition, if older records apply to 
species affected by disturbance or canopy closure the 
models are likely to be less reliable in these cases. Site 
location was recorded by latitude and longitude. The 
absence of original maps for many of the quadrats 
makes cross-checks of the accuracy of site location difficult,
but it is likely that 95 percent of the records were
located within 250 meters of their true position (Elith 
et al. 1998). 

In the full study, twenty-nine species were selected 
for modeling; these were all perennial species and rep- 
resented a range of scarcities, sizes, and life forms. For 
any modeled species, the full set of 3,522 sites was 
used to document species presence or absence at each 
site. We were particularly interested in the compara- 
tive success of models for common and rare species, 
with rarity being expressed either as scarcity in the re- 
gion or as conservation status. In this chapter, the re- 
sults for eight of these species are presented; these rep- 
resent the range of results in the larger set and include 
four species classified as rare and four more-common 
species (Table 24.1). 


Environmental Data 


Models of species distribution are based on the relationship
between the presence of species, or presence
and absence of species, and environmental data (Table
24.2). Although models can be developed on site-spe- 
cific data, prediction for reserve planning requires ei- 
ther extensive site-specific data, which is rarely avail- 
able, or continuous environmental data across the 
region. The digital data available for this region in- 
cluded climate data estimated from long-term monthly 
records of temperature, rainfall, and radiation; topo- 
graphic data derived from a 9-second (about 250 me- 
ters) digital elevation model (DEM); and categorical 
data on rock type and vegetation class (Elith et al. 
1998; and see Table 24.3). These produced a total of 
thirty-five potential predictor variables rasterized to a 
cell size of 9 seconds in a geographic projection. These 
variables were sampled at quadrat locations for model 
development and predictions were made to all cells in 
the region. 


Modeling Methods 


Four modeling methods are presented in this chapter: 
see Table 24.2 for a summary of main references and 
example applications and software. Since our intention
was to assess the adequacy of predictions for regional
planning, methods were applied in a way that
could be adapted to many species for the whole region
within the restrictive time frames for planning. Rare
species were treated in the same way as common ones.


1. ANUCLIM (Nix 1986) is a bioclimatic envelope
method in which the climate profile for a species is
developed by sampling the estimated climate data
(in this case, the first twenty-three variables in
Table 24.3) at the sites where the species is known
to be present. The potential climatic domain for
the species is the multidimensional envelope that
encompasses all recorded locations of the species.
For prediction, the climatic variables at each cell in
the region are used to generate a rank for the cell
that reflects its position in the climatic envelope for
the species. In our modeling, cells were ranked
from 0 to 3 for each species, and rankings were
based on pairs of percentiles (0 and 100, 5 and 95,
10 and 90). For instance, a cell outside the range of
sampled climates would be given a rank of 0, a cell
inside the envelope but only sitting at an extreme
percentile such as the 97th percentile would be
given a 1, and a cell at the core of the envelope (between
the 10th and 90th percentiles) would be
given the highest rank.

2. Generalized linear models (GLMs) are a class of


TABLE 24.2.


Modeling methods: key references, examples, and software.


Method: Bioclimatic envelopes
Key references: ANUCLIM: Nix (1986) and CRES (1998)
Example of ecological applications: ANUCLIM: kauri pine in New Zealand (Mitchell 1992), eucalypts in South Africa (Richardson and McMahon 1992), possums (Lindenmayer et al. 1991b) and myrtle beech (Busby 1986) in Australia
Software used: ANUCLIM v1.01 (CRES 1998)

Method: Generalized linear models (GLMs)
Key references: McCullagh and Nelder (1989) and Agresti (1996)
Example of ecological applications: Arboreal marsupials in Australia (Lindenmayer et al. 1990a, 1990b, 1991a), eucalypts in Australia (Austin et al. 1983, 1984, 1990, 1994), alpine plants in Switzerland (Guisan et al. 1998), and a bird species in America (Akcakaya and Atwood 1997)
Software used: S-PLUS v3.3 for Windows (MathSoft 1995)

Method: Generalized additive models (GAMs)
Key references: Hastie and Tibshirani (1990)
Example of ecological applications: Trees in New Zealand (Yee and Mitchell 1991), eucalypts in Australia (Austin and Meyers 1996), and wetland plants in The Netherlands (Bio et al. 1998)
Software used: S-PLUS v3.3 for Windows (MathSoft 1995)

Method: Genetic algorithms
Key references: Mitchell (1996)
Example of ecological applications: GARP algorithm in Australia (Stockwell and Noble 1992) applied to plant species (Elith et al. 1998)
Software used: GARP (Stockwell and Peters 1999)


TABLE 24.3.


Details of environmental data offered in modeling.


Climate variables (twenty-three variables, listed below)
  Type^a: C
  Source and format: All climate grids calculated using (a) ANUCLIM
    algorithms (Nix 1986) constructed from long-term climate data, and
    (b) the GEODATA National 9-second DEM (Australian Surveying and
    Land Information Group [AUSLIG]). Projection was geographic.
  Cell size or scale^b: 9-second cell size (about 250 m)
  Variables:
    Annual mean temperature
    Mean diurnal range temperature
    Maximum temperature of warmest month
    Minimum temperature of coldest month
    Temperature annual range
    Mean temperature wettest quarter
    Mean temperature driest quarter
    Mean temperature warmest quarter
    Mean temperature coldest quarter
    Annual precipitation
    Precipitation wettest month
    Precipitation driest month
    Precipitation wettest quarter
    Precipitation driest quarter
    Precipitation warmest quarter
    Precipitation coldest quarter
    Annual mean radiation
    Highest monthly radiation
    Lowest monthly radiation
    Radiation wettest quarter
    Radiation driest quarter
    Radiation warmest quarter
    Radiation coldest quarter

Lithology
  Type^a: F
  Source and format: Derived from Land Systems of Victoria coverage,
    Dept. Natural Resources and Environment. Data grouped into four
    classes related to fertility and drainage.
  Cell size or scale^b: 1:250000

Geology
  Type^a: F
  Source and format: Derived from Surface Geology coverage,
    Dept. Natural Resources and Environment. Data grouped into five
    classes related to fertility and drainage.
  Cell size or scale^b: 1:250000

Elevation
  Type^a: C
  Source and format: 9-second DEM (AUSLIG)
  Cell size or scale^b: 9-second cell size

Topographic position
  Type^a: C
  Source and format: Mean difference in elevation between current
    grid cell and all cells within a 1-km radius
  Cell size or scale^b: 9-second cell size

Wetness index
  Type^a: C
  Source and format: Calculated from flow accumulation, cell width
    and slope, according to Moore (1993).
  Cell size or scale^b: 9-second cell size

Gully
  Type^a: F
  Source and format: Calculated from a 1:25000 DEM, selecting cells
    with low topo position at that scale and resampling to 250 m.
  Cell size or scale^b: 9-second cell size

Slope
  Type^a: C
  Source and format: Calculated from DEM.
  Cell size or scale^b: 9-second cell size

Ecological vegetation class (evc)
  Type^a: F
  Source and format: EVC100 coverage, Dept. Natural Resources
    and Environment. Twenty-nine vegetation classes
    in this region (Woodgate et al. 1994).
  Cell size or scale^b: 1:100000

Aspect: eastness
  Type^a: C
  Source and format: Calculated from DEM.
  Cell size or scale^b: 9-second cell size

Aspect: southness
  Type^a: C
  Source and format: Aspect (degrees) converted to continuous
    measures after Pereira and Itami (1991) and Zerger (1995).
  Cell size or scale^b: 9-second cell size

Latitude
  Type^a: C
  Source and format: Created in ArcInfo
  Cell size or scale^b: 9-second cell size

Longitude
  Type^a: C
  Source and format: Created in ArcInfo
  Cell size or scale^b: 9-second cell size


^a Continuous (C) and factor or categorical (F) variable.
^b Coverages converted to grids in ArcInfo by transforming to geographic projection, and gridding to 9-second cells aligned with the DEM. Values
assigned to grid cells on the basis of the category covering the largest area of the cell.


statistical model that encompasses classical linear
regression and analysis of variance and extends to
models that can be applied to categorical response
data. Logistic regression is the appropriate type of
GLM for presence-absence data. It uses the binomial
distribution to model the variation in the response,
and a logit link (i.e., log[μ/(1 − μ)], where μ
is the mean response) to relate the mean response
to the linear predictor on a scale suitable for
binomial data. The predictor can comprise one to
many variables, and continuous variables can be
modeled with a variety of parametric response
functions (e.g., linear, quadratic, or cubic). The
model can be applied to new data to obtain
predictions of the probability of the species'
presence at each new location. In this study, all
available environmental variables were initial
candidates as predictors. For each species, the
predictor candidate set was reduced to a subset
through a univariate logistic analysis, where a
variable was retained for the subset if the change
in deviance between the model containing the
variable and the null model was significant at a
p-value of 0.2. In the subset, variables that were
correlated were excluded through analysis of
variance inflation factors (Sokal and Rohlf 1981;
Booth et al. 1994). Models were developed with an
automated forward selection procedure in which, at
each step, the variable associated with the largest
reduction in deviance was retained, and the process
was repeated until none of the remaining variables
significantly improved the model (p < 0.05). If
latitude and longitude were included in the subset,
they were considered last. Linear, quadratic, and
cubic terms were investigated for all continuous
variables, but interactions were not considered.


3. Generalized additive models (GAMs) are an extension
of GLMs and allow nonparametric, nonlinear
data-dependent smooth functions to be fitted to 
continuous variables in the predictor. This means 
that a functional form does not have to be specified 
(cf. GLMs), giving greater flexibility to the model- 
ing process. The final model can include both lin- 
ear and smoothed terms for continuous variables 
and categorical variables. Variable selection for 
GAMs was approached in the same manner as for 
GLMs. GAMs were developed with four degrees of 
freedom initially assigned to each continuous vari- 
able, with subsequent testing of changes in de- 
viance used to reduce the degrees of freedom where 
possible. Relationships were smoothed with cubic 
splines. 


4. GARP (Genetic Algorithm for Rule-set Prediction;
see Stockwell and Peterson, Chapter 48) is a computer-based
modeling approach that uses a genetic
algorithm to develop a set of rules that describe the 
relationship between species data (presence or 
presence-absence) and environmental variables. 
The rules are then applied to all other cells of inter- 
est for prediction. The genetic analogy is intended 
to express the idea of an evolving set of rules, from 
the initial random starting point to one optimally 
fit for prediction. The resulting predictions express 
the relative likelihood of the presence of the species 
at any given site, with sites of presence always as- 
signed the maximum likelihood. Models developed 
in GARP were constructed with the full set of
environmental variables. Since each run of the genetic
algorithm provides a unique solution, and
maps of the resulting predictions were demonstrably
different even with a quick visual assessment,
spatial predictions were constructed from the average
of ten iterations of the program.


                         Observed
                     presence   absence
Predicted presence      a          b
          absence       c          d


Figure 24.2. Representation of a confusion matrix.
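The percentile-based ranking described for ANUCLIM in method 1 can be illustrated with a minimal Python sketch. It is not the ANUCLIM software: the function names, the linear-interpolation percentile helper, and the rule that a cell takes the lowest rank it receives on any single climate variable are assumptions made for illustration. The rank of 2 for the band between the 5th and 95th percentiles is inferred from the three percentile pairs given in the text.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in 0..100) of a pre-sorted list."""
    if len(sorted_vals) == 1:
        return sorted_vals[0]
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


# Percentile pairs from the text, tried from the core of the envelope outward.
BANDS = [(10, 90, 3), (5, 95, 2), (0, 100, 1)]


def variable_rank(presence_values, cell_value):
    """Rank 0..3 of one cell value against one variable's climate profile."""
    vals = sorted(presence_values)
    for lo_q, hi_q, rank in BANDS:
        if percentile(vals, lo_q) <= cell_value <= percentile(vals, hi_q):
            return rank
    return 0  # outside the range of sampled climates


def envelope_rank(profile, cell):
    """Assumed combination rule: the lowest rank over all variables."""
    return min(variable_rank(profile[name], cell[name]) for name in profile)


# Hypothetical profile: values 0..100, so each percentile equals its own value.
profile = {"annual_mean_temp": list(range(101)),
           "annual_precip": list(range(101))}
print(envelope_rank(profile, {"annual_mean_temp": 50, "annual_precip": 50}))
print(envelope_rank(profile, {"annual_mean_temp": 50, "annual_precip": 97}))
```

With this toy profile, a cell at the 97th percentile of one variable is pulled down to rank 1 even though it sits in the core for the other variable, which mirrors the example given in the text.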


Model Assessment 


Measurement of accuracy. Our primary concern was 
to produce models of species distribution that discrim- 
inated well between sites where the species was pres- 
ent and sites where it was absent, so an appropriate 
measure of model discrimination was required. If 
presence-absence data are available, the performances 
of models are often summarized in a confusion or 
error matrix (Fielding, Chapter 21) that records the 
counts of observed presence and absence against pre- 
dicted presence and absence (Fig. 24.2). 

Two common measures derived from the confusion 
matrix are the true positive fraction (TPF) or sensitivity
(a/[a + c] in Fig. 24.2) and the false positive fraction
(FPF, or [1 − specificity], or in other words b/[b + d]).
Construction of the confusion matrix requires
that a threshold is selected so that the predictions 
(probability, likelihood, or rank) can be converted into 
binary data. The selection of such a threshold is not 
necessarily a straightforward task (Fielding and Bell 
1997), particularly for rare species, and excludes some 
of the information generated by the model. 
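The dependence of TPF and FPF on the chosen threshold can be shown with a short Python sketch. The cell labels follow the ratios quoted above, so that a and c are observed presences (predicted present and predicted absent, respectively) and b and d are observed absences. The function names, the toy predictions, and the choice to treat a prediction equal to the threshold as "present" are assumptions for illustration only.

```python
def confusion_counts(predictions, observed, threshold):
    """Return (a, b, c, d) for binary observations at a given threshold."""
    a = b = c = d = 0
    for pred, obs in zip(predictions, observed):
        predicted_present = pred >= threshold  # assumed tie rule
        if predicted_present and obs:
            a += 1  # predicted present, observed present
        elif predicted_present:
            b += 1  # predicted present, observed absent
        elif obs:
            c += 1  # predicted absent, observed present
        else:
            d += 1  # predicted absent, observed absent
    return a, b, c, d


def tpf_fpf(predictions, observed, threshold):
    """TPF = a/(a + c) and FPF = b/(b + d); needs both classes observed."""
    a, b, c, d = confusion_counts(predictions, observed, threshold)
    return a / (a + c), b / (b + d)


# Hypothetical predictions for three presence sites and three absence sites.
preds = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1]
obs = [1, 1, 1, 0, 0, 0]
for threshold in (0.5, 0.3):
    print(threshold, confusion_counts(preds, obs, threshold))
```

Lowering the threshold from 0.5 to 0.3 moves one presence site from cell c to cell a while leaving b and d unchanged, which is the kind of trade-off a single threshold hides.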

One threshold-independent measure that is used 
regularly in clinical medicine is the receiver (or rela- 
tive) operating characteristic (ROC) curve (Metz 
1978; Swets 1988). This plots the TPF against the FPF
over a large range of thresholds (see Fig. 24.2). The
area under the curve (AUC) is an important summary 
statistic for ROC plots. The AUC ranges from 0 to 1, 
with values less than 0.5 indicating that the model 
tends to predict presence at sites at which the species 
is, in fact, absent. A value of 0.5 implies no discrimi- 
nation (i.e., no consistent difference between predic- 
tions from the observed absence group and the ob- 
served presence group), which is equivalent to random 
predictions, and 1.0 indicates perfect discrimination. 
For binary data, the AUC is identical to the nonpara- 
metric two-sample Mann-Whitney statistic (Hanley 
and McNeil 1982). An AUC of 0.78 can be under- 
stood to mean that, if random pairs of predictions 
were sampled from observed presence and observed 
absence groups, the prediction for the observed pres- 
ence would be higher than the prediction for the ob- 
served absence 78 percent of the time. Since the pre- 
dictions are intended for use in ranking the suitability 
of land for inclusion in reserves, this statistic is an ap- 
propriate measure of model usefulness. 
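The pairwise reading of the AUC given above can be checked directly on a small example. The Python sketch below is illustrative rather than the S-Plus routine used in the study: the function name and the toy predictions are hypothetical, and counting a tied pair as half a success is a common convention assumed here because ranked predictions (such as the 0 to 3 envelope ranks) produce many ties.

```python
def auc_pairwise(presence_preds, absence_preds):
    """Share of presence/absence pairs in which the presence site scores higher.

    Ties count as half a success (assumed convention).
    """
    wins = 0.0
    for p in presence_preds:
        for q in absence_preds:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(presence_preds) * len(absence_preds))


# Hypothetical predictions at observed-presence and observed-absence sites.
presence = [0.9, 0.6, 0.4]
absence = [0.7, 0.2, 0.1]
print(round(auc_pairwise(presence, absence), 2))  # 7 of 9 pairs, i.e. 0.78
```

In this toy case the presence site outranks the absence site in seven of the nine possible pairs, so the AUC is about 0.78, the same value used in the worked interpretation in the text.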

In this study, models with AUCs of 0.75 or above 
will be considered sufficiently discriminatory to be po- 
tentially helpful (see Pearce et al., Chapter 32). We 
calculated nonparametric AUC and DeLong standard 
errors (DeLong et al. 1988) with routines written for 
use in the S-Plus statistical program (MathSoft 1995; 
Elizabeth Atkinson personal communication). All re- 
sults are presented graphically with confidence inter- 
vals based on the standard errors for an individual 
AUC estimate. However, where modeling methods are 
tested on a common validation set, the ROC curves 
are correlated and the unadjusted standard errors are 
less sensitive than they should be in detecting differ- 
ences between the AUCs (Hanley and McNeil 1983). 
In these cases, the results presented in the text are 
based on tests of significance developed by DeLong et 
al. (1988), which take into account the correlation of 
the ROC curves. 

Validation data sets. To investigate the effect of the
validation set on the conclusions about model discrimination,
two sets of data have been used:


1. The original data used to develop the model was 
also used to test discrimination. Although it is well 
documented that this does not represent true validation
(Fielding and Bell 1997; Chatfield 1995), it
is still regularly used and published in government
and scientific circles (e.g., Lindenmayer et al.
1991b; Skidmore et al. 1996; NSW NPWS 1998;
Franklin 1998). In this chapter, AUCs calculated
on this data set will be referred to as AUCmodel.

2. An independent set of species data was collected in
1997 specifically for validation of the models for 
these species. Our intention was to visit sites that 
spanned the range of predictions for each method 
for each species and to gather a reasonable number 
of presence records. The selection of the first one 
hundred cells was based on expert opinion of likely 
sites for each of the scarcer species. This provided 
records independent of the models we had created 
and was intended to supply sites that may include 
rare species if the models failed. In order to select 
sites predicted by the models to be very likely to 
contain the species, the final set was required to in- 
clude at least three cells for each species/method 
combination from the top 0.1 percent of the distri- 
bution of predictions. If the set did not already 
contain such cells, they were added through a ran- 
dom selection from the highest predictions. Cells 
were not visited if a species had previously been 
found there. In all cases, the lower range of predic- 
tions was well represented by sites selected for high 
likelihood for other species. The final validation set 
contained 391 sites. These were located by latitude 
and longitude at the center of the selected cell (all 
methods predicted presence or absence within 
9-second cells). In the field, the locations were con- 
verted to Australian Map Grid (AMG) projection 
and sites located on 1:100,000 topographic maps, 
or 1:25,000 maps where there was insufficient de- 
tail on smaller-scale maps to locate the site reliably. 
The site was sampled at its given location regard- 
less of whether other mapped data (such as 
mapped vegetation class) matched the actual situa- 
tion. Presence-absence data was recorded both 
within a 900-square-meter quadrat (consistent 
with the protocol for the modeling data) and 
within the 9-second (about 250 meters) cell (Elith 
et al. 1998), and in this chapter, data for the pres- 
ence-absence of the species within the 9-second cell 
are presented. AUCs calculated on this data set will
be referred to as AUCindep in the results. 


Validation scale. We were concerned that, even 
though the models were being developed at the regional
scale (about 10^6 hectares), they would be applied
to decision making within a subcatchment
(about 10^4 to 10^5 hectares). A subcatchment is considered
a different scale compared with a whole region
because it has a smaller extent. The data sets have 
been used for model assessment in their entirety (i.e., 
at the regional scale) and as subsets that represent the 
subcatchment scale for each species. For a subset, the 
locations of the species in the full modeling set (set 1, 
above) were taken to indicate the geographic extent of 
the species in the RFA, and bounds were defined that 
enclosed most presence locations and extended to the 
estimated limits of the inhabited subcatchments. The 
subcatchment limits were defined through a visual in- 
terpretation of the region's DEM because we did not 
have the GIS capacity to do otherwise at the time; sub- 
sequently, a more-detailed definition of the subcatch- 
ments created through ArcInfo algorithms has indi- 
cated that, even though the validation subsets may 
differ slightly with an automated approach, the con- 
clusions from the study are similar regardless of which 
limits were used. In summary, the subsets exclude sites 
that are in subcatchments that have no other record of 
species presence. 
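The subsetting rule summarized above can be sketched in a few lines of Python. This is a simplification under stated assumptions: the study drew subcatchment limits by visual interpretation of the DEM and enclosed most (not necessarily all) presence locations, whereas the sketch keeps every subcatchment that holds at least one presence record in the modeling set. The field names `subcatchment` and `present` and the site records are hypothetical.

```python
def inhabited_subcatchments(modeling_sites):
    """Subcatchments with at least one presence record in the modeling set."""
    return {site["subcatchment"] for site in modeling_sites if site["present"]}


def subcatchment_subset(sites, inhabited):
    """Keep only sites that fall inside an inhabited subcatchment."""
    return [site for site in sites if site["subcatchment"] in inhabited]


# Hypothetical modeling records for one species.
modeling = [
    {"subcatchment": "A", "present": True},
    {"subcatchment": "A", "present": False},
    {"subcatchment": "B", "present": False},
    {"subcatchment": "C", "present": True},
]
# Hypothetical independent validation records for the same species.
validation = [
    {"subcatchment": "A", "present": False},
    {"subcatchment": "B", "present": False},
    {"subcatchment": "C", "present": True},
]

inhabited = inhabited_subcatchments(modeling)
print(sorted(inhabited))
print(len(subcatchment_subset(validation, inhabited)))
```

The same inhabited set, defined once from the modeling data, is applied to both the modeling and the validation records, so the two subsets cover the same geographic extent.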


Comparison of Modeling Success: 
Data Sets 


At the regional scale, the modeling data set (set 1)
returned AUCmodel greater than 0.85 for 91 percent of
the models and greater than 0.75 for 94 percent of the
models. By comparison, observations from the
independent validation set (set 2) returned AUCindep
greater than 0.85 for 69 percent of the models and
greater than 0.75 for 88 percent of them (Fig. 24.3).
The general trend was for AUCmodel to be higher than
AUCindep. In most cases, the difference between
AUCmodel and AUCindep was relatively small, with 81
percent of the values for AUCindep being within 10
percent of the values for AUCmodel. Thirty-one percent
of the differences were significant at the 5 percent
level, and in 80 percent of these cases AUCmodel was
higher than AUCindep (Fig. 24.3). These significant
differences were restricted to four of the eight species,
and the most marked reductions in AUC (AUCmodel to
AUCindep) were centered on two species (Tetratheca
stenocarpa and Helichrysum scorpioides).

Figure 24.3. Area Under Curve (AUC) summaries of the
receiver operating characteristic curves for eight species
(indicated by initials of their scientific names), where predictions
are tested against the modeling data set (black plots) and the
independent data set (gray plots) at the regional scale. Plots
show the AUC and its 95 percent confidence intervals for each
species/method combination.

Figure 24.4. Area Under Curve (AUC) summaries of the
receiver operating characteristic curves for eight species
(indicated by initials of their scientific names), where predictions
are tested against the modeling data set at the regional scale
(solid black marks) and the subcatchment scale (open black
marks). Plots show the AUC and its 95 percent confidence
intervals for each species/method combination.

Figure 24.5. Area Under Curve (AUC) summaries of the
receiver operating characteristic curves for eight species
(indicated by initials of their scientific names), where predictions
are tested against the modeling data set (black plots) and the
independent data set (gray plots) at the subcatchment scale.
Plots show the AUC and its 95 percent confidence intervals for
each species/method combination.


AUCs were always lower at the subcatchment
extent than at the regional extent (Fig. 24.4). In the
subcatchments, 66 percent of the models had AUCmodel
greater than 0.75, and only 34 percent of the models
had AUCindep greater than 0.75 (Fig. 24.5). The
overwhelming trend was for AUCmodel to be higher than
AUCindep, and the difference between the AUCs
tended to be greater than at the regional extent, with
only 56 percent of subcatchment values for AUCindep
being within 10 percent of the values for AUCmodel
(Fig. 24.5). However, only 22 percent of these
differences were significant at the 5 percent level.


Comparison of Modeling Success: 
Methods and Species 


The AUC statistics derived from the independent
validation set at the subcatchment scale have been
primarily used to study differences in model performance
between methods and between species since these data
are more likely to represent the true modeling success
at a scale appropriate for the intended application of
these methods. The results for the eight species
reported here are qualitatively the same as for the full
set of species considered in the larger study (Elith et al.
1998). They indicate no consistent significant
difference among the discriminatory performance of
predictions from the four different methods, although there
is an apparent trend toward better discrimination
from the GAMs and GLMs. This coincides with a
tendency for the methods that use presence-only data
(ANUCLIM in this study) to perform less well than
the presence-absence models (Fig. 24.6). Significance
tests that take into account the correlation of the
AUCs indicate that there is no evidence that the AUCs
are not equal for six of the species (p > 0.05). For the
remaining two species (H. scorpioides and
Leptospermum grandifolium) there is evidence of inequality,
with tests of pairs of methods showing that the GAM
for L. grandifolium and the GAM and GLM for H.
scorpioides have significantly better discrimination
than models produced with the other methods.

Figure 24.6. Area Under Curve (AUC) summaries of the receiver operating characteristic curves for four modeling methods, where
predictions are tested against the modeling data set (a) and the independent data set (b) at the subcatchment scale. Plots show
the AUC and its 95 percent confidence intervals, with results for each species grouped by method, and species in the same order
as previous figures.

The difference in modeling success between species 
tended to be more pronounced than the differences be- 
tween modeling methods (Fig. 24.5). Some species 
such as Grevillea barklyana and L. grandifolium ap- 
peared to be modeled with sufficient accuracy for the 
information to be useful in land-use decisions, 
whereas the models for other species such as
Wittsteinia vacciniacea and Phebalium bilobum were
unlikely to be useful. There were no clear associations
between modeling success and species’ characteristics
such as rarity, range within the region, or life-form.


Discussion 


It is clear that our perception of whether a model is 
accurate enough to be useful is affected by the way in 
which we assess it (see Van Horne, Chapter 4). The re- 
sults of this study show that use of the modeling data 
(i.e., the data used to build the model) as a validation 
set gives an overly optimistic view of modeling suc- 
cess. The result is not surprising and is consistent with 
the reviews of model uncertainty and its assessment by 
Fielding and Bell (1997) and Chatfield (1995). In this
study, the degree of optimism displayed by the
modeling data set varied with the species, the modeling
method, and the extent of analysis.

At the regional scale, the most marked overopti- 
mism was focused on two species, T. stenocarpa and 
H. scorpioides. A distinctive feature of the presence 
records for these two species was that they were both 
found, in the independent surveys, at sites relatively 
distant (up to 55 kilometers) from any modeling 
records. These are unlikely to be new occurrences of 
the species and indicate that either the surveys
contributing to the modeling data did not adequately
sample the region, missing unique and important envi- 
ronmental combinations, or that the species were 
present but not correctly identified (this is particularly 
possible for T. stenocarpa, which is difficult to distin- 
guish from T. ciliata in some of its forms [D. Frood, 
personal communication]). The result supports the 
common-sense notion that models built on inaccurate 
or unrepresentative records will not predict reliably to 
new sites. Targeted sampling that filled in "gaps" or
deficiencies in data sets (as assessed by geographic or 
environmental representativeness, by date of survey, 
or by expert knowledge) would increase model accu- 
racy in these situations. 

Some modeling methods appear particularly (and 
unjustifiably) effective when the modeling data set is 
used for model assessment. Methods that create a rule 
or measure an environmental distance (such as GARP 
in this study) often predict highest likelihood at the 
original record site. In contrast, methods that fit a pre- 
diction surface (such as the GLMs and GAMs) do not 
necessarily predict high probabilities at presence loca- 
tions, especially if identical environmental conditions 
are also associated with absence records. Any method 
that by default or by consequence of the applied meas- 
ures predicts highest likelihoods of occurrence at the 
modeling presence sites will appear to have good dis- 
crimination when tested against the modeling data set. 
This can be seen in the data presented in Fig. 24.6a 
and 24.6b. GARP appears to perform significantly 
better than the other three methods when tested 
against the original data (Fig. 24.6a), but testing with 
independent data (Fig. 24.6b) indicates no clear dis- 
tinction between the methods. The only accurate way 
to test models developed with methods that return 
highest likelihoods at presence sites is to ensure that 
the validation data set is independent. If a completely 
independent data set is not available, data partitioning 
that excludes some data from the modeling process 
(see, for example, Fielding, Chapter 21) can achieve a 
degree of independence that will result in a more real- 
istic picture of model performance. 
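
The data partitioning mentioned above can be sketched as a simple k-fold split, in which each site is held out of model building exactly once. This is a minimal illustration only: the function name, the number of folds, and the random seed are assumptions for the example and are not taken from the study.

```python
import random


def k_fold_partitions(n_sites, k=5, seed=0):
    """Return k (train_indices, test_indices) pairs with disjoint test folds.

    Hypothetical helper: n_sites is the number of survey sites; every site
    appears in exactly one test fold and in the training set of the others.
    """
    indices = list(range(n_sites))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k roughly equal folds
    partitions = []
    for held_out, test_idx in enumerate(folds):
        train_idx = [
            site
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for site in fold
        ]
        partitions.append((sorted(train_idx), sorted(test_idx)))
    return partitions


if __name__ == "__main__":
    for train, test in k_fold_partitions(23, k=5):
        print(len(train), "training sites;", len(test), "held-out sites")
```

A model would be fitted on each training subset and its AUC computed only on the corresponding held-out sites. Partitioning of this kind gives only a degree of independence, because held-out sites still come from the same surveys as the training sites.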

The same patterns in measurement optimism were 
apparent at the subcatchment scale; differences 
tended to be relatively greater but less often
significant. The significance is affected by the standard
errors for these statistics, which increase with
decreasing AUC and with increasing imbalance between
number of presence records and number of absence 
records (Hanley and McNeil 1982). Therefore, the 
confidence limits at this smaller extent tend to be rel- 
atively large, especially for rare species. The clearest 
effect of validation extent is demonstrated in Figure 
24.4, which shows that, even for the optimistic tests 
with the modeling data set, a model may have insuf- 
ficient discrimination to be useful at a subcatchment 
scale even though it appears to discriminate well at a 
regional scale (e.g., see W. vacciniacea, P. bilobum). 
The effect of scale will not be the same in every 
region—it is likely to vary with the homogeneity of 
the landscape, the match between the environmental 
patterns and the grain of the mapped environmental 
variables, and the intensity of sampling. Neverthe- 
less, it is important to consider the scales most rele- 
vant to the context and objectives of the modeling 
and to give attention to assessing the models at a re- 
alistic grain and extent. 
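
The dependence of the standard error on AUC and on the balance of presence and absence records can be illustrated with a short sketch. It assumes the approximation of Hanley and McNeil (1982), which the text cites; whether the confidence intervals in the figures were computed in exactly this way is an assumption, and the function names and example numbers are illustrative only.

```python
import math


def auc(presence_scores, absence_scores):
    """AUC as the proportion of presence/absence pairs ranked correctly.

    Ties between a presence score and an absence score count as one half.
    """
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


def hanley_mcneil_se(area, n_presence, n_absence):
    """Approximate standard error of an AUC (Hanley and McNeil 1982)."""
    q1 = area / (2.0 - area)
    q2 = 2.0 * area * area / (1.0 + area)
    variance = (
        area * (1.0 - area)
        + (n_presence - 1) * (q1 - area * area)
        + (n_absence - 1) * (q2 - area * area)
    ) / (n_presence * n_absence)
    return math.sqrt(variance)


def confidence_interval(area, n_presence, n_absence, z=1.96):
    """Normal-approximation 95 percent interval, clipped to [0, 1]."""
    se = hanley_mcneil_se(area, n_presence, n_absence)
    return max(0.0, area - z * se), min(1.0, area + z * se)


if __name__ == "__main__":
    # Hypothetical case: 20 presences and 200 absences in a subcatchment.
    print(confidence_interval(0.75, 20, 200))
    print(confidence_interval(0.90, 20, 200))
```

With the same total number of sites, the interval widens as the AUC falls and as presence records become scarce relative to absences, which is the pattern described for rare species at the subcatchment extent.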

The differences in model performance between 
methods tended to be less marked than the differ- 
ences between species. As assessed with the inde- 
pendent validation set at the subcatchment scale, 
the trend was for the presence-only method 
(ANUCLIM), which only uses climatic variables for 
prediction, to perform less well than the other meth- 
ods, though the differences were generally not signif- 
icant. Other studies that have compared methods 
have generally concluded that the statistical methods 
(GLMs and GAMs) perform better than ANUCLIM 
(Ferrier and Watson 1996) and that GAMs may 
model species responses to habitat more successfully 
than GLMs (Austin and Meyers 1996; Bio et al. 
1998). It is likely that, given a more complete and 
more accurate modeling data set and more extensive 
validation data, significant differences between the 
methods would emerge, on the basis that we would 
expect the added refinement provided by absence 
data and additional predictor variables to improve 
modeling success. 

Some species appear to be well modeled at the
subcatchment scale, and the success of these models
suggests that it is worth putting effort into modeling
species distributions, even for rare species (see Karl et
al., Chapter 51). The models presented in this study
were constructed using approaches that were likely to 
be applied in the RFAs, but a number of refinements 
are possible. Filtering the presence-absence data to re- 
duce spatial autocorrelation (Mackey et al. unpub- 
lished) or dealing with the autocorrelation within the 
model (Augustin et al. 1996; Fielding and Bell 1997) 
may improve the models. The region includes a di- 
verse range of landscapes, and constraining the set of 
modeling data to exclude absence data outside the en- 
vironmental range of the species may be effective 
(Austin and Meyers 1996), although determination of 
the appropriate range is not straightforward when the 
data set is not comprehensive. Other refinements 
could focus on additional sampling to overcome data 
inadequacies (Neldner et al. 1995), the development 
of variables more directly related to species distribu- 
tion (Austin et al. 1984), and alternative approaches 
to variable selection for any of the methods (e.g., 
Booth et al. 1994). 

Since we cannot identify particular characteristics 
of species or their data sets that can be used as a 
guide to the likely success of modeling, there remains 
a pressing imperative to assess models carefully
before using them in decision making. Managers
interested in using predictions to discriminate between
sites may not have a choice about how the
predictions were constructed, but an understanding of the
general tendencies of the method will help interpreta- 
tion of results. For example, envelope methods (such 
as ANUCLIM) tend to include in the envelope sites 
of absence, whereas GLMs theoretically predict ac- 
curate probabilities of presence. Two issues of practi- 
cal importance highlighted by our results that can be 
assessed by a manager are (1) At what scale is the 
model likely to be effective and will it discriminate 
between sites at the grain (resolution) required? (2) 
How has the model been tested? If it has not been 
tested with independent data at the grain and extent 
that is relevant to the application, the results need to 
be assessed with care. 


CHAPTER 25

Semiquantitative Response Models
for Predicting the Spatial Distribution
of Plant Species


Antoine Guisan


According to Zar (1996), four main types of data
can be found in biology: (1) data on a ratio, (2) 
on an interval, (3) on an ordinal (or ordered), and (4) 
on a nominal scale. The first two data types, which 
can be further divided into continuous or discrete 
data, are commonly grouped under the denomination 
of quantitative data; ordinal data are synonymously 
called semiquantitative data, and nominal data can 
also be called qualitative data. To date, statistical 
model building in ecology has mostly focused on 
quantitative (e.g., biomass) and qualitative (e.g., vege- 
tation units) responses. However, little attention has 
been paid to semiquantitative, ordinal responses. 

An ordinal scale is defined as an ordering of meas- 
urements, with only relative—instead of quantitative— 
differences between values. Measurements of biological
entities in classes that are, for instance, longer,
larger, or more abundant than others
are clearly ordinal. Ordinal data can also originate
from ratio or interval data by slicing a continuous 
scale, although such procedure inexorably results in a 
loss of important ecological information. On the other 
hand, some so-called quantitative data—such as 
Braun-Blanquet (1964) and other abundance-domi- 
nance scales in phytosociology (see Table 25.1)—are 
best considered ordinal in statistical analyses. Among 
other ordinal outcomes commonly found in ecology 
are successive phenological stages of plant-flowering
processes (e.g., Theurillat and Schlüssel 1998) or of
insect development (e.g., Candy 1991; Manel and
Debouzie 1997), categories of tree height or diameter in
forestry (e.g., Schabenberger 1995), or tolerance levels
in ecotoxicology (of an organism to a pollutant;
e.g., Ashford 1959). More generally, ecological data
might certainly be best analyzed as ordinal where one 
expects the uncertainty in field measurement to be 
over an acceptable threshold (e.g., visual abundance 
classes of animals). 

Statistical methods for ordered data have been used 
since the late fifties (e.g., Ashford 1959). Multiple re- 
gression models for an ordinal response were first de- 
scribed in the late sixties (e.g., Walker and Duncan 
1967; see McCullagh 1980 or Anderson 1984 for 
early reviews on this matter). Since then, ordinal
regression models have been included in the broader
category of generalized linear models (GLMs; see
McCullagh and Nelder 1989) and were principally used in
epidemiological studies, toxicity assessments (bioassays; e.g.,
Ashford 1959) and social sciences, as these disciplines
often have to deal with semiquantitative variables. But
few examples of ordinal regression models currently
exist in ecology (see, for example, Schabenberger
1995; Guisan et al. 1998), even though in this field
many data are actually on an ordinal scale.

In this chapter, I present one regression model, the
Proportional Odds Model, which can be successfully
applied to an ordinal response. The use of this model
is illustrated through case studies of modeling spatial
distribution of plant species, together with their
ordinal density on the ground, with data from two
mountain ranges: a small study area in the Swiss Alps (see
Guisan et al. 1998) and the entire range of the Spring
Mountains in Nevada (see Guisan et al. 1999).


TABLE 25.1.

The ordinal scale used in the two case studies ranges from 0 to 5. These ordered values correspond to intervals
of unequal widths of plant ground cover density. Correspondence is also given with previous scales commonly
used in phytosociology (from Guisan 1997).

Class  Cover (%)  Braun-Blanquet (1964)  Barkman et al. (1964)  van der Maarel (1979)
0      0          -                      -                      (0)
1      1-5        r, +, 1                r, +, 1, 2m            1, 2, 3, 4
2      6-12       2                      2a                     5
3      13-25      2                      2b                     6
4      26-50      3                      3                      7
5      51-100     4, 5                   4, 5                   8, 9

Specific objectives were (1) to set up ordinal regres- 
sion models for the distribution of plant species’ cover 
density, (2) to implement these models in a geographic 
information system (GIS), and (3) to assess the ade- 
quacy of ordinal model predictions in a general con- 
text, in other words, giving no greater weight to any 
one type of prediction success or of prediction error 
than another, and to compare them to predictions
from (a) Gaussian and Poisson models applied to the
ordinal response, and (b) logistic presence-absence
models.


Methods 


The following examples are taken from two study 
areas—the Belalp area in Switzerland and the Spring 
Mountains in Nevada, USA (see Fig. 25.1)—described 
respectively in Guisan et al. (1998) and Guisan et al. 
(1999). 


Study Areas 


The study area of Belalp is a wide, open, north- 
south-oriented side valley of the Rhone Valley, located 
in the Aletsch region (Valais, Switzerland; Fig. 25.1a). 
Elevation ranges between 1,867 and 3,554 meters. 
Geology is mainly siliceous (gneiss, granite). The
climate is subcontinental. Soils are mainly of a podzolic
type. The upper subalpine vegetation is mainly
dominated by mesophilous heaths, swards, and fens. The
alpine vegetation belt, ranging from 2,300 to 3,000 
meters, is dominated by low heaths, swards, and 
snowbed communities. The landscape has been modi- 
fied by human activity for centuries through intensive 
grazing by cattle, sheep, and goats, the main effect 
being the lowering of the timberline by several hun- 
dred meters. At present, grazing is extensive. 

The Spring Mountains are located in southern 
Nevada, 20 kilometers west of Las Vegas, at latitude 
36°15’ N and longitude 115°45’ W (Fig. 25.1b). Ris- 
ing out of the Mojave Desert at 700 meters, the Spring 
Mountains reach an elevation of 3,600 meters within 
a distance of 10 kilometers. Plant communities range 
from Mojave Desert shrub at the base, through Joshua 
tree woodland, sagebrush, pinion/juniper woodland, 
and a variety of mixed conifers and aspen at the upper 
elevations. The highest region supports limber (Pinus 
flexilis) and bristlecone pine (Pinus longaeva), with a 
small area of alpine tundra at the peak. Deep canyons 
radiate from the highest peaks, creating complex mo- 
saics of vegetation across solar radiation gradients on 
opposing slopes. Natural and anthropogenic fires and
disturbances have added to the complexity of the
vegetation mosaic.

Figure 25.1. Study areas: (a) The Belalp area, near the Aletsch Glacier in the Swiss Alps. (b) The Spring Mountains, near Las Vegas
in Nevada.


The Ordinal Scale


The ordinal scale used in the following case studies is
derived from the modified Braun-Blanquet (1964)
abundance/dominance scale (Barkman et al. 1964;
Table 25.1). In phytosociological studies, data are still
sampled by attributing to each plant species in a relevé
one of several predefined classes of percentage ground
cover. The Barkman scale is clearly more
semiquantitative (i.e., ordinal) than truly quantitative. It takes
values 0, r, +, 1, 2m, 2a, 2b, 3, 4, or 5 to characterize
vegetation abundance/dominance (Table 25.1), which
correspond respectively to no cover, rare species, some
individuals present, less than 5 percent cover, less than
5 percent cover but abundant, 6-12 percent, 13-25
percent, 26-50 percent, 51-75 percent, and 76-100
percent cover. In most studies, this scale is
transformed into a scale of linear integer values ranging
from 0 to 9 (e.g., van der Maarel [1979] scale; Table
25.1) and is then considered a quantitative variable if
calculations are to be made (see Jongman et al. 1987),
using, for example, a least-squares (LS) model.
Although the predictions from such models might be
acceptable, there are four main limitations to using such
a procedure (Guisan and Harrell 2000):


1. On a quantitative-ratio scale, as required by, for
instance, an LS model, the differences between
successive ordered categories (e.g., between 1 and 2
and between 5 and 6) are assumed to have the
same meaning, although in fact they do not (see
the scale above). If quantitative comparisons are
made on such a linearized scale, ratios would be
used to calculate estimated coefficients, thus
leading to possible misinterpretation, because the scale
actually represents intervals on an arbitrary
nonlinear scale.

2. Linearizing the abundance/dominance scale might
be considered a similar process to log-linear
transformation, which was used, before the appearance
of GLMs, when attempting to retrieve a normal
error distribution. In that case, the statistical model
should be based on a log-normal distribution
rather than on the normal distribution considered
in LS models.

3. Modeling an ordered categorical response variable
(which is discrete with few values, including
floor and ceiling) using a continuous probability
model was shown by Snell (1964) to cause possible
problems.

4. Unless an appropriate link function is used (i.e.,
one such as the logit rather than the identity or
logarithm), predictions from such quantitative models
may take values much higher or much lower (e.g.,
negative values) than the maximum or minimum
theoretical ordinal class. This is unacceptable from
both an ecological and a methodological
standpoint.


It would be better to consider each class to be sim- 
ply higher or lower than the adjacent class, depending 
on its position along the ordinal scale. This implies fit- 
ting a regression model for ordinal response, as pro- 
posed by Schabenberger (1995), Manel and Debouzie 
(1997) or Guisan et al. (1998; see Guisan and Harrell 
2000). In the following, I will use the term cover den- 
sity rather than cover abundance to follow the termi- 
nology proposed by Morrison and Hall (Chapter 2). 
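
The fourth limitation listed above can be made concrete with a small sketch: a least-squares line fitted to ordinal classes treated as numbers can predict values below 0 or above 5 once the predictor moves beyond the calibration range. The data below are invented for illustration (a hypothetical standardized predictor and hypothetical cover classes), and the helper names are not from the original study.

```python
def fit_least_squares_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Prediction on the (supposedly) quantitative class scale."""
    return intercept + slope * x_new


if __name__ == "__main__":
    # Hypothetical calibration data: standardized predictor, ordinal class.
    predictor = [-2.0, -1.0, 0.0, 1.0, 2.0]
    cover_class = [0, 1, 3, 4, 5]
    intercept, slope = fit_least_squares_line(predictor, cover_class)
    for x_new in (-3.0, 0.0, 3.0):
        print(x_new, round(predict(intercept, slope, x_new), 2))
```

In this invented example the fitted line predicts a negative class at one end of the gradient and a class above 5 at the other, neither of which exists on the ordinal scale.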


Sampling the Response Variable 


In the Belalp study area, the two species data sets used 
for calibrating and evaluating the model (Fig. 25.1a) 
are the same as described in Guisan et al. (1998).
Calibration points of 4 square meters (N = 205) were
sampled following a grid sampling scheme, meaning
at all intersection points of a 250x250-meter grid 
overlaying the whole study area. Sampling on this 
scale means that autocorrelation is avoided (see 
Guisan 1997) and ensures that significance tests for se- 
lecting predictors remain valid (Palmer and Van der 
Maarel 1995; Van der Maarel et al. 1995). A set of in- 
dependent points (N = 92) were later sampled ran- 
domly from a 25x25-meter grid overlaying the study 
area, for the evaluation of model predictions. At each 
point, localized in the field by means of a Garmin GPS 
navigator, a map accurate to a scale of 1:10,000 and a 
Thommen altimeter, ocular estimates of ground cover 
density were assessed for each species according to the 
Barkman abundance-dominance scale (Table 25.1).
They were later reclassified into the ordinal scale
described in Table 25.1 for fitting the models.
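
The reclassification step can be sketched as a lookup from Barkman field codes to the six ordinal classes. The mapping below is my reading of the cover intervals given in the text and in Table 25.1 (all codes with less than 5 percent cover fall in class 1, and codes 4 and 5 are merged into class 5); the dictionary and function names are hypothetical.

```python
# Barkman abundance/dominance code -> ordinal cover class (0 to 5).
# Assumed from the cover intervals in the text: r, +, 1, 2m (up to 5 percent),
# 2a (6-12), 2b (13-25), 3 (26-50), 4 and 5 (51-100).
BARKMAN_TO_ORDINAL = {
    "0": 0,
    "r": 1,
    "+": 1,
    "1": 1,
    "2m": 1,
    "2a": 2,
    "2b": 3,
    "3": 4,
    "4": 5,
    "5": 5,
}


def reclassify(barkman_codes):
    """Convert a list of Barkman codes to ordinal classes, failing on unknown codes."""
    classes = []
    for code in barkman_codes:
        if code not in BARKMAN_TO_ORDINAL:
            raise ValueError(f"unknown Barkman code: {code!r}")
        classes.append(BARKMAN_TO_ORDINAL[code])
    return classes


if __name__ == "__main__":
    print(reclassify(["0", "+", "2m", "2a", "2b", "3", "5"]))
```

Raising an error on unrecognized codes, rather than silently assigning a class, keeps transcription mistakes in field sheets visible.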

In the Spring Mountains, a data set of 230 plots 
(generally 20x20 meters), including all higher plant 
species (upland vegetation only, with associated ocular 
estimates of ground cover density), was sampled from 
a 30x30-meter grid overlaying the study area, accord- 
ing to an ad hoc stratified design (see Guisan et al. 
1999) by The Nature Conservancy (TNC; Nachlinger 
and Reese 1996). Each point was localized in the field 
using a Trimble Geoexplorer GPS navigator with post- 
processed differential corrections. The original single 
data set was split into two subsets (Fig. 25.1b), one 
for calibrating the model (training data set) and one 
for evaluating the model predictions (evaluation data 
set). As for the Belalp data, ocular estimates of 
ground-cover density were reclassified into the ordinal 
scale described in Table 25.1. 

In both cases, ordinal models were only fitted for 
those species showing a sufficient variation in ground- 
cover density over the training data set. As a criterion, 
species’ density was modeled for a species only when 
at least three classes of ground-cover density could be 
recorded for it, with each class recorded in a minimum 
of five plots. 
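
The selection criterion can be expressed as a small check on the class counts of each species. This sketch assumes that the absence class (0) counts as one of the three required classes, which the text does not state explicitly; the function name and default thresholds are illustrative.

```python
from collections import Counter


def has_sufficient_variation(cover_classes, min_classes=3, min_plots=5):
    """True if at least `min_classes` classes each occur in `min_plots` or more plots.

    cover_classes: ordinal class (0 to 5) recorded for one species in each
    training plot. Whether class 0 should count is an assumption made here.
    """
    counts = Counter(cover_classes)
    well_sampled = [c for c, n in counts.items() if n >= min_plots]
    return len(well_sampled) >= min_classes


if __name__ == "__main__":
    # Hypothetical species: absent in 50 plots, class 1 in 6, class 2 in 5.
    print(has_sufficient_variation([0] * 50 + [1] * 6 + [2] * 5))
    # Same species with class 2 recorded in only 4 plots.
    print(has_sufficient_variation([0] * 50 + [1] * 6 + [2] * 4))
```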


Environmental Predictors 


Environmental descriptors—hereafter called predic- 
tors—used to model species distribution in the Belalp 
area were obtained from two main sources: (1) a dig- 
ital elevation model (DEM), obtained from the Swiss 
Federal Office of Topography (high reliability), or (2) 
remotely sensed data (black-and-white and color in- 
frared aerial photographs) scanned and rectified at a 
1x1-meter grain (resolution). All predictors were cal- 
culated (DEM-related predictors) or aggregated (re- 
mote sensing data) to the 25x25-meter grain of the 
DEM. Aggregation was performed using nearest-neighbor
assignment. Some of these predictors are described
in more detail in Guisan et al. (1998) and can
be summarized as follows: (1) annual mean temperature
(amt, calculated from elevation using a field-calibrated
transition formula), (2) slope angle (slo), (3)
four indices of topographic position (tp5, tp10, tp20,
and tp40, representing a gradient from ridge top to
middle slope to valley, and calculated with different
moving windows with respective radii of 125, 250, 500,
and 1,000 meters), (4) two indices of solar radiation
(rad1 and rad2, obtained by taking the first two
axes of a principal component analysis [PCA; explaining
99 percent of the variance] made on nineteen individual
daily sums of solar radiation taken every tenth
day from early April to late August), (5) two indices of
snow cover (snowi96 and snowi97, obtained by combining
respectively two and four aerial photographs taken at
regular intervals during 1996 and 1997), (6) the
three bands of the color infrared aerial photograph
(cir-1 to cir-3), and (7) the potential permafrost (perm,
modeled with the PERMAKART model; Keller 1992).

In contrast, predictor variables used for explaining
species distribution in the Spring Mountains were all 
terrain related, derived from the 30x30-meter DEM 
(see Guisan et al. 1999 for more details on their calcu- 
lation). They were (1) elevation (elev), (2) slope angle 
(slo), (3) northness (nness) and eastness (eness), calculated
respectively as the cosine and sine of aspect (in
degrees), (4) summer solstice and spring equinox inso- 
lation calculated using the SOLARFLUX model (Het- 
rick et al. 1993; ssol and esol), and (5) four topo- 
graphic position indices calculated at the different 
smoothing levels: 150 meters, 300 meters, 1,000 me- 
ters, and 2,000 meters (tp150 to tp2000). 

In both studies, predictors were standardized (by 
removing the mean and dividing by 1 standard devia- 
tion), prior to fitting the models, and stored in the 
GIS. 
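
Standardization as described above amounts to a z-score transformation of each predictor. The sketch below assumes the sample standard deviation (n - 1 denominator), which the text does not specify; the function name and the example elevations are hypothetical.

```python
import statistics


def standardize(values):
    """Subtract the mean and divide by one standard deviation.

    Uses the sample standard deviation; this choice is an assumption.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


if __name__ == "__main__":
    # Hypothetical elevations (meters) at four sampling points.
    elevations = [1900.0, 2300.0, 3000.0, 3500.0]
    print([round(z, 3) for z in standardize(elevations)])
```

Putting all predictors on a common scale makes the fitted coefficients comparable in magnitude across predictors measured in different units.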


Statistical Models 


In recent years, predictive distribution modeling (see 
Franklin 1995 or Guisan and Zimmermann 2000 for 
a review) has been broadly applied worldwide to pre- 
dict the distribution of plants (e.g., Lehmann 1998; 
Leathwick 1998; Franklin 1998; Guisan et al. 1998, 
1999; Zimmermann and Kienast 1999) and animals 
(e.g., Aspinall 1992; Augustin et al. 1996; Mladenoff 
and Baker 1999), species and communities. Despite its 
intrinsic limitations (pseudoequilibrium assumption, 
no temporal dimension), this approach is considered a 
powerful and rapid way to assess the possible impact 
of a climatic change on the distribution and abun- 
dance of plant species (e.g., Lischke et al. 1998; 
Guisan and Theurillat 2000), to test biogeographic
hypotheses (Mourell and Ezcurra 1996; Leathwick
1998), and for studies in conservation biology (see the 
many examples in this volume). 

Generalized linear models (GLMs; McCullagh and 
Nelder 1989) are particularly appropriate for such 
predictive modeling, as attested to by the numerous 
papers published in this field in recent years (see 
Guisan and Zimmermann 2000, and see several chap- 
ters in this volume, including McKenney et al., Chap- 
ter 31; Pearce et al., Chapter 32; Jones et al., Chapter 
35; Klute et al., Chapter 27; Fertig and Reiners, Chap- 
ter 42; and Vernier et al., Chapter 50). GLMs are an 
extension of classical linear models. In GLMs, the
combination of predictor variables $x_j$ ($j = 1, \ldots, p$)
produces a linear predictor LP, which is related to the
expectation $E(Y)$ of the response variable $Y$ through a
link function $g(\cdot)$, such as


$$g(E(Y)) = LP = \alpha + X\beta \qquad (1)$$


where $X$ is a matrix of $p$ column vectors $[x_1, x_2, \ldots, x_p]$,
$\alpha$ is the constant term to be estimated, and
$\beta = \{\beta_1, \beta_2, \ldots, \beta_p\}$ is the vector of $p$
coefficients to estimate for the predictor variables. $X\beta$
denotes the matrix product, so that the $i$th element of $g(E(Y))$ is


$$\alpha + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip}$$


Unlike classical linear models, which presuppose a
normal distribution and an identity link, the
distribution of Y may be any of the exponential family
distributions and the link function may be any monotonic
differentiable function, like the logit (whose inverse is
exp(LP)/[1 + exp(LP)]) for binomial models, logarithm for
Poisson models, or identity for Gaussian models.
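
The relation between the linear predictor and the expected response can be sketched in a few lines. The expression exp(LP)/[1 + exp(LP)] maps a linear predictor back to a probability, the exponential does the same for a log link, and the Gaussian case leaves LP unchanged. The function names and example coefficients are hypothetical and serve only to illustrate equation 1.

```python
import math


def linear_predictor(alpha, betas, x_row):
    """LP for one observation: alpha + beta_1 * x_1 + ... + beta_p * x_p."""
    return alpha + sum(b * x for b, x in zip(betas, x_row))


def logit(mu):
    """Logit link: log odds of a probability mu, with 0 < mu < 1."""
    return math.log(mu / (1.0 - mu))


def inverse_logit(lp):
    """Inverse of the logit link; equals exp(lp) / (1 + exp(lp))."""
    return 1.0 / (1.0 + math.exp(-lp))


# Inverse link functions turning LP into the expected response E(Y).
INVERSE_LINKS = {
    "binomial": inverse_logit,    # logit link
    "poisson": math.exp,          # logarithmic link
    "gaussian": lambda lp: lp,    # identity link
}


if __name__ == "__main__":
    # Hypothetical coefficients and standardized predictor values.
    lp = linear_predictor(0.5, [1.0, -2.0], [0.3, 0.1])
    for family, inverse_link in INVERSE_LINKS.items():
        print(family, round(inverse_link(lp), 4))
```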

All models were fitted in S-Plus (Mathsoft). Ordinal 
models were fitted by using a Proportional Odds (PO) 
model (lrm function, Harrell 1999; see Guisan and
Harrell 2000 for the mathematical rationale and 
Guisan et al. 1998 for an ecological application). Pois- 
son, Gaussian, and binomial models were fitted by 
using the glm function and specifying a correct proba- 
bility distribution and link function. Final models 
were fitted and evaluated by using custom S-Plus func- 
tions (one for each type of model), which additionally 
implemented the model in the GIS GRID calculator by 
automatically generating a custom AML (Arc Macro 
Language; see Guisan et al. 1999). 

For two species in the Belalp study area, (1) a GLM
with a Poisson distribution and a logarithmic link, 
and (2) a GLM with a Gaussian distribution and an 
identity link, were additionally fitted, with the ordinal 
abundance as the response, in order to compare the 
predictions from these two alternative models to those 
from ordinal models. In both study areas, simple pres- 
ence-absence GLMs with a binomial distribution and 
a logistic link (see Nicholls 1989; Guisan et al. 1999) 
were fitted for the same species, in addition to ordinal 
models, to evaluate the capacity of ordinal models to 
predict at least presence-absence correctly. 


Evaluating the Predictions 


Evaluating a model is a critical task in the overall 
process of model building (see Fielding, Chapter 21). 
It is fundamental for assessing the quality of the model 
and the range of situations in which the model can 
properly be used (called model applicability in Guisan 
and Zimmermann 2000). For the same type of response
variable, several measures and procedures are often
proposed to evaluate a model, as shown by Fielding and
Bell (1997) in the case of a binary presence-absence
response. The assessment of model quality and
applicability therefore depends primarily upon the choice
of a measure that correctly evaluates the model in the
specific context of the study's objectives. The focus of
this chapter, however, is on comparing models. Hence, I
will discuss the accuracy of model predictions in a
general context, without assuming different weights for
omission and commission errors as would be the case if
conservation objectives were the focus (see Fielding,
Chapter 21, for more information on cost assessments).

Evaluation of model accuracy was carried out dif- 
ferently for the ordinal and binomial models. Because 
ordinal predictions are on a continuous scale from 0 
to 5 (sum of probabilities), they should be trans- 
formed into discrete classes in order to compare them 
to any observations. For this purpose, I used the 
threshold calibration method described in Guisan et 
al. (1998), which consists of choosing the threshold 
providing the best agreement on the training data set 
and using it to evaluate the model on the evaluation 
data set. The calibrated threshold is dependent on the 
measure of agreement used. Thus, several thresholds
are calibrated if several agreement measures are used.
Ordinal predictions were evaluated with Goodman
and Kruskal's (1954) γ (Gamma) and Somers' (1962)
d_yx, two measures of agreement for comparing ordinal
variables as discussed in Gonzalez and Nelson (1996)
and used by Guisan (1997) and Guisan et al. (1998) in
the context of static distribution modeling. Ordinal
predictions were also reduced to presence-absence in
order to compare them with presence-absence
predictions made by logistic binomial models, for the
same subset of species.
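
As an illustration of the two agreement measures, the sketch below computes them from an ordinal confusion matrix using their standard pair-counting definitions: γ = (C - D)/(C + D) and d_yx = (C - D)/(C + D + T_y), where C and D are the numbers of concordant and discordant pairs and T_y counts pairs tied on y but not on x. Treating rows as the dependent variable y and columns as x is an assumption of this sketch, as are the function names and the toy matrix in the usage note.

```python
def pair_counts(table):
    """Count concordant, discordant, and tied-on-y-only pairs.

    table[i][j] = number of cases in row class i (y) and column class j (x).
    """
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = tied_on_y_only = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Same row, different column: tied on y but not on x.
            for l in range(j + 1, n_cols):
                tied_on_y_only += table[i][j] * table[i][l]
            # Pairs with a case from a higher row class.
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    product = table[i][j] * table[k][l]
                    if l > j:
                        concordant += product
                    elif l < j:
                        discordant += product
    return concordant, discordant, tied_on_y_only


def goodman_kruskal_gamma(table):
    concordant, discordant, _ = pair_counts(table)
    return (concordant - discordant) / (concordant + discordant)


def somers_d_yx(table):
    concordant, discordant, tied = pair_counts(table)
    return (concordant - discordant) / (concordant + discordant + tied)
```

For the toy matrix `[[3, 1, 0], [1, 2, 1]]`, γ is 9/11 and d_yx is 9/19. Because the denominator of d_yx adds the tied pairs, d_yx can never exceed γ in absolute value on the same matrix.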

In the case of presence-absence models, Cohen's
(1960) Kappa (κ) provides an overall measure of accuracy
(i.e., it makes full use of the information in the
confusion matrix; Fielding and Bell 1997). However, it
treats omission and commission errors alike, and it has
sometimes been considered sensitive to prevalence
(i.e., unequal group sizes, in this case between
presences and absences), although Manel et al. (in press)
did not find any evidence of this. A measure that
seems to be less sensitive to prevalence is the
normalized mutual index (NMI) introduced by Forbes
(1995), but it cannot be calculated if any cell of
the confusion matrix is zero, for instance, when one of
the two possible error rates (omission or commission)
is null (which is exactly what one would expect to
obtain in this particular modeling context). Both
measures test whether the model predictions differ
from chance performance, but both also depend
upon the choice of a threshold to cut the
predicted probabilities (on a scale from 0 to 1) into
presence/absence. For these reasons, Fielding and Bell
(1997) suggest using the threshold-independent
receiver operating characteristic (ROC) method for
evaluating model predictions. The ROC method consists
of plotting sensitivity (i.e., the true positive rate)
against [1 - specificity] (i.e., the false positive rate)
across thresholds and then integrating the area under
the curve, which can take values between 0.5 (1:1 line;
agreement no different from that obtained by chance)
and 1.0 (perfect agreement). Here, I give the correct
classification rate, sensitivity, specificity, Kappa,
and the ROC plot for comparing ordinal and binomial
models (see Fig. 25.5).
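
The threshold-dependent measures named above can all be derived from a 2 x 2 confusion matrix, as in the following sketch. The function name is an assumption. The usage note applies it to the presence-absence counts printed for Nardus stricta in Table 25.3 (46, 2, 33, 124 at calibration with the γ threshold); which off-diagonal cell is the omission error and which the commission error is assumed here, an assumption that affects sensitivity and specificity but not Kappa or the correct classification rate.

```python
def accuracy_measures(tp, fn, fp, tn):
    """Return ccr, sensitivity, specificity, and Cohen's Kappa.

    tp/tn: correctly predicted presences/absences.
    fn/fp: omission/commission errors.
    """
    n = tp + fn + fp + tn
    observed_agreement = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    chance_agreement = (
        (tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)
    ) / n ** 2
    kappa = (observed_agreement - chance_agreement) / (1.0 - chance_agreement)
    return {
        "ccr": observed_agreement,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "kappa": kappa,
    }
```

Calling `accuracy_measures(tp=124, fn=2, fp=33, tn=46)` gives a Kappa that rounds to the 0.611 reported in Table 25.3, and the evaluation counts (13, 1, 20, 58) give 0.432.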


Results 


Ordinal models were successfully fitted for several
species in both study areas. Their ability to predict
ordinal species cover density was compared to that of
Poisson and Gaussian GLMs, respectively (Table 25.2).
All three types of model were fitted, for the same 
species, with the same subset of predictors. As shown 
by quantile-quantile plots (QQ-Plots) for the Poisson 
and Gaussian models and for the same two species, 
residuals are not distributed on the 1:1 line as would 
be the case if the postulates regarding the distribution 
of the response variable were respected (Fig. 25.2). For 
both species, all predictors explained a significant pro- 
portion of the deviance and were thus truly selected in 
each of the compared models. The proportion of de- 
viance explained, adjusted by the number of degrees of 
freedom, was similar for ordinal and Poisson models 
(0.54 and 0.59 respectively) but much lower for the 
Gaussian model (0.22), although these measures are
difficult to compare between GLMs fitted with different
distributions. For the two species given in the example,
ordinal models failed to predict the maximum possible 
class of five on the training data set but were able to 
predict the maximum possible class on the evaluation 
data set. In turn, the Poisson model was able to predict 
the maximum cover density class on the training data 
set in the case of Carex curvula, although it did not 
predict classes higher than 3 on the evaluation data 
set. However, the Poisson model predicted excessively
high values at other locations in the study area (up to
22 for C. curvula; see Figs. 25.3 and 25.4), which
correspond to impossible density classes (5 already
represents 50-100 percent ground cover). The Gaussian
model behaved inadequately in both cases, predicting
negative values (down to -6) at many locations and being
unable either to fit or to predict the maximum values
on both the calibration and evaluation data sets
(maximum of three predicted in both cases).
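
The proportion of deviance explained can be written out explicitly. The adjusted form below is one commonly used with GLMs; the chapter does not print the formula it applied, so treating this as the exact adjustment used for Table 25.2 is an assumption. Here n is the number of observations and p the number of estimated parameters.

$$ D^2 = \frac{\text{null deviance} - \text{residual deviance}}{\text{null deviance}}, \qquad D^2_{\mathrm{adj}} = 1 - \frac{n - 1}{n - p}\,\left(1 - D^2\right) $$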


Ordinal Versus Presence-Absence Models 


The evaluation of ordinal models was based on the γ
and d_yx statistics. Their ability to predict at least
presence/absence, using the κ statistic, was assessed by
comparing their predictions, recoded into
presence/absence, to those of binomial models.

TABLE 25.2.


Comparison of ordinal, Poisson, and Gaussian GLMs for two
alpine species in Belalp.


Model [a]                Ordinal    Gaussian   Poisson

Carex curvula
amt + amt^2              < .0001    < .0001    < .0001
slo                      0.0003     0.0010     < .0001
sc97                     0.0080     < .0001    0.0024
D² [b]                   0.54       0.22       0.59
max-fitted (5) [c,d]     4          2          4
min-fitted (0) [c,d]     0          -1         0
max-pred (4) [d,e]       4          D          3
min-pred (0) [d,e]       0          -1         0
max-pred-area            5          S          22
min-pred-area            0          -5         0

Trifolium alpinum
amt + amt^2              < .0001    < .0001    < .0001
slo^2                    0.0064     0.0001     0.0355
tp500                    0.0002     0.0348     0.0009
cir2^2                   0.0001     0.0014     < .0001
cir3^2                   0.0012     0.0019     < .0001
D² [b]                   0.52       0.25       0.52
max-fitted (5) [c,d]     4          2          5
min-fitted (0) [c,d]     0                     0
max-pred (5) [d,e]       5                     7
min-pred (0) [d,e]       0          1          0
max-pred-area            5                     7
min-pred-area            0                     0


[a] For each model, the p-value of the deviance reduction chi test is
first given for each predictor.

[b] D² = overall proportion of deviance explained by each model,
adjusted by the number of degrees of freedom used.

[c] max-fitted and min-fitted = highest and lowest values fitted by
the model on the training data set.

[d] The maximum possible value (i.e., observed) is given in brackets.

[e] max-pred and min-pred = highest and lowest values predicted on
the evaluation data set.

Note: Max-pred-area and min-pred-area are respectively the highest
and lowest values predicted over the whole study area. Min-fitted,
min-pred, and min-pred-area thus indicate whether negative values were
fitted or predicted by the model at any location in the study area
(usually outside the range of the calibration and evaluation data sets).
See Figures 25.2 and 25.3 for visualizing the extent of pixels where
negative values are predicted by the Gaussian models.


Figure 25.2. Quantile-quantile plots (QQ-Plots) of standard Pearson regression residuals for (a) the Poisson GLM (generalized
linear model) of Carex curvula, (b) the Gaussian GLM of Carex curvula, (c) the Poisson GLM of Trifolium alpinum, (d) the Gaussian
GLM of Trifolium alpinum.


In Table 25.3, the confusion matrices for Nardus
stricta, using (a) the calibration and (b) the evaluation
data set, show an example of ordinal model
evaluation. This model provides a maximum value of 0.82
for γ at the calibration (Tables 25.3 and 25.4), which
corresponds to cutting the continuous predictions
(summed probabilities) at multiple thresholds (pt) of
0.15, 1.15, 2.15, 3.15, and 4.15 (i.e., every successive
integer plus 0.15, hereafter written [I].15). Recoding the
predictions to presence/absence (0 stays 0; 1 to 5
become 1) provides a κ of 0.61 for the same model.
Thresholds at every [I].85 are obtained instead of
every [I].15 (pt in Table 25.3) if d_yx is optimized
instead of γ, corresponding to a maximum d_yx
of 0.691 and an associated κ of 0.767 (Table 25.3).
Hence, using the threshold providing the best d_yx
rather than the best γ provided a better prediction of
presence/absence for this species, as measured by κ,
although the reverse was true for other species (Table
25.4).
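
The cutting and calibration steps described above can be sketched as follows. The sketch assumes that a class boundary is reached when the summed probability is greater than or equal to the threshold (the text does not say how exact ties are handled), that candidate thresholds are tried at 0.01 increments, and that the first best threshold is kept. All function names are illustrative, and γ is computed here directly from pairs of cases rather than from a confusion matrix.

```python
def cut_prediction(summed_probability, offset, max_class=5):
    """Class = number of thresholds offset, 1 + offset, ... that are reached."""
    return sum(1 for k in range(max_class) if summed_probability >= k + offset)


def to_presence_absence(ordinal_class):
    """Recode an ordinal class: 0 stays 0, any higher class becomes 1."""
    return 0 if ordinal_class == 0 else 1


def gamma(observed, predicted):
    """Goodman and Kruskal's gamma from paired lists (0.0 if undefined)."""
    concordant = discordant = 0
    for a in range(len(observed)):
        for b in range(a + 1, len(observed)):
            sign = (observed[a] - observed[b]) * (predicted[a] - predicted[b])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    if concordant + discordant == 0:
        return 0.0
    return (concordant - discordant) / (concordant + discordant)


def calibrate_offset(observed, summed_probabilities, measure):
    """Return the (offset, value) maximizing the measure on training data."""
    best_offset, best_value = None, float("-inf")
    for step in range(1, 100):  # offsets 0.01, 0.02, ..., 0.99
        offset = step / 100
        classes = [cut_prediction(p, offset) for p in summed_probabilities]
        value = measure(observed, classes)
        if value > best_value:
            best_offset, best_value = offset, value
    return best_offset, best_value
```

With an offset of 0.15, a summed probability of 4.1 is assigned to class 4 and one of 0.10 to class 0. The offset returned by `calibrate_offset` on the training data would then be reused unchanged to cut predictions on the evaluation data.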
Overall, the evaluation of ordinal models can be
considered fair to good (Table 25.4), as all values of γ

TABLE 25.3.

Comparing predictions to observations in the case of the Nardus stricta model. Ordinal confusion matrices and related
optimal γ (Gamma) and Somers' d_yx agreement measures for ordinal variables. Tables on the left are obtained by
using the threshold providing the best γ (see Guisan et al. 1998) and, on the right, the threshold providing the best
value for Somers' d_yx. Ordinal matrices are further reduced to presence-absence tables, and related κ (Kappa)
measure of accuracy, to check the ability of ordinal models to predict at least presence-absence correctly.


(a) Calibration (N = 205)


Goodman and Kruskal's γ               Somers' d_yx

     0    1    2    3    4    5            0    1    2    3    4    5
0   46    1    0    0    1    0       0   62    2    1    1    1    0
1   20    2    2    1    0    0       1   10   12    3    2    2    0
2    6   12    2    2    4    1       2    4    6    3    1    9    3
3    5    5    3    2    8    5       3    3    5    6    5   20   19
4    2    5    7    7   30   22       4    0    0    1    4   11    9
5    0    0    0    1    0    3       5    0    0    0    0    0    0

pt [a] = 0.15                         pt [a] = 0.85
γ = 0.819                             d_yx = 0.691
κ = 0.611                             κ = 0.767

     0    1                                0    1
    46    2                               62    5
    33  124                               17  121

(b) Evaluation (N = 92)

Goodman and Kruskal's γ               Somers' d_yx
     0    1    2    3    4    5            0    1    2    3    4    5
0 1 1 0 O 0 (0) 0 19 2 0 al 1 0 
1 7 3 0 al 1 0 4 9 6 1 0 3 1 
2 oM ores. © 4 101 2 2 5 3 0 1 0 
3 al 2 al S 2 ab 3 1 3 3 3 10 3 
4 D 6 3 0 T1 T 4 2 4 O (0) 3 4 
5 1 1 (0) 0 0 dl 5 0 0 O 0 6) 

pt [a] = 0.15                         pt [a] = 0.85
γ = 0.644                             d_yx = 0.531
κ = 0.432                             κ = 0.544

     0    1                                0    1
    13    1                               19    4
    20   58                               14   55


[a] pt is the probability threshold.


measured on the independent evaluation data set
(held-back data) range between 0.64 and 0.94 and between
0.53 and 0.87 for d_yx (the latter always provides a
value lower than γ when applied on the same
confusion matrix; see Gonzalez and Nelson 1996).
However, γ and d_yx provide different patterns of change
when predictions are cut at successive 0.01 probability
thresholds (Fig. 25.5). As a result, the maximum value
of γ, max(γ), does not provide the same probability
threshold as the one provided by max(d_yx). In
this respect, results suggest (Table 25.4) that, although
optimizing γ at the calibration seems to provide
better presence/absence predictions (i.e., κ_γ) than
does optimizing d_yx (i.e., κ_dyx), (for Cercocarpus

TABLE 25.4.

Comparison of ordinal and binomial (p/a) models for three species in Belalp (Swiss Alps) and six in the Spring Mountains (Nevada).


                                         Calibration data set [a,b,c]          Evaluation data set [a,b,c]

Species                  D²_O   D²_B     γ      κ_γ    d_yx   κ_dyx  κ_B       γ      κ_γ    d_yx   κ_dyx  κ_B

Belalp
Carex curvula            0.542  0.576    0.899  0.592  0.821  0.592  0.769     0.717  0.415  0.561  0.415  0.490
Nardus stricta           0.633  0.575    0.819  0.611  0.691  0.767  0.788     0.644  0.432  0.531  0.544  0.516
Trifolium alpinum        0.560  0.437    0.825  0.681  0.686  0.681  0.685     0.716  0572059088075 04128015727

Spring Mountains
Cercocarpus ledifolius   0.542  0.489    0.901  0.649  0.744  0.460  0.658     0.918  0.670  0.707  0.319  0.589
Coleogyne ramosissima    0.694  0.812    0.952  0.889  0.899  0.825  0.903     0.908  0.735  0.790  0.627  0.768
Ephedra viridis          0.658  0.531    0.984  0.699  0.717  0.738  0.739     0.954  0.716  0.701  0.746  0.745
Pinus flexilis           0.535  0.548    0.963  0.684  0.869  0.579  0.699     0.938  0.619  0.851  0.590  0.619
Pinus longaeva           0.727  0.716    0.961  0.788  0.925  0.788  0.877     0.921  0.696  0.864  0.696  0.745
Pinus ponderosa          0.554  0.596    0.944  0.799  0.823  0.496  0.799     0.927  0.640  0.874  0.365  0.703


[a] D² = percentage of deviance explained, γ = Gamma, d_yx = Somers' d_yx, κ = Kappa.
[b] For indices: O = ordinal model, B = binomial presence/absence model.
[c] κ_γ and κ_dyx = Kappa calculated on the ordinal confusion matrix recoded into presence/absence using the threshold
providing respectively the optimal γ and the optimal d_yx. See main text for details.


ledifolius, Coleogyne ramosissima, Pinus flexilis, and
P. ponderosa) than the opposite situation (for N.
stricta and Ephedra viridis only), an equivalent quality
of predictions can also be obtained in a certain
number of cases (for C. curvula, T. alpinum, and P.
longaeva). At the evaluation, where optimal thresholds
are used to assess model accuracy on the independent
data set, the latter category includes C. curvula and P.
longaeva but not T. alpinum, for which the same value of
κ was obtained from two different thresholds at
calibration (left part of Table 25.4).

Ordinal and binomial models explain, overall, a 
similar proportion of deviance (D2) (Table 25.4). 
However, ordinal models explain a greater proportion 
of deviance than binomial models in the case of N. 
stricta, T. alpinum, C. ledifolius, E. viridis, and P. lon- 
gaeva, and less in the case of C. curvula, C. ramosis- 
sima, Pinus flexilis, and P. ponderosa. 

The evaluation of the ordinal and binomial models
of C. curvula was assessed graphically (Fig. 25.5) by
displaying the change of six measures of accuracy
(four binary: correct classification rate, sensitivity,
specificity, and Kappa; and two ordinal: γ and d_yx; Figs.
25.5a and 25.5c) obtained by cutting the probability
at successive threshold values (using here a 0.01
increment) between 0 and 1. The ordinal model appears
less threshold-sensitive than the binomial model in 
this case (flatter curves; Fig. 25.5a). Fig. 25.5b and 
25.5d show the ROC plots for the ordinal and bino- 
mial models respectively. Curves on both plots have 
similar shapes, although points on the curves for the 
ordinal model are more concentrated toward higher 
values of sensitivity and more toward lower values for 
the binomial model (especially in the case of the eval- 
uation curve). Thus, overall, both models provide sim- 
ilar accuracy when predicting presence/absence (i.e., 
the area under the curve is visually approximately the 
same). 
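
The visual comparison of the areas under the two ROC curves can also be made numerically. The sketch below builds the ROC points from a list of thresholds and integrates them with the trapezoid rule; it is a generic illustration under stated assumptions (a site is predicted present when its probability is greater than or equal to the threshold, and both presences and absences occur in the data), not the procedure used to draw Fig. 25.5. The function names and the toy data in the usage note are hypothetical.

```python
def roc_points(observed, probabilities, thresholds):
    """Return one (false positive rate, sensitivity) pair per threshold."""
    points = []
    for threshold in thresholds:
        tp = fp = fn = tn = 0
        for y, p in zip(observed, probabilities):
            predicted = 1 if p >= threshold else 0
            if y == 1 and predicted == 1:
                tp += 1
            elif y == 1:
                fn += 1
            elif predicted == 1:
                fp += 1
            else:
                tn += 1
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points


def area_under_curve(points):
    """Trapezoid-rule area under the ROC curve, anchored at (0,0) and (1,1)."""
    curve = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum(
        (x2 - x1) * (y1 + y2) / 2.0
        for (x1, y1), (x2, y2) in zip(curve, curve[1:])
    )
```

For the toy data `observed = [0, 0, 1, 1]` and `probabilities = [0.1, 0.4, 0.35, 0.8]`, with ninety-nine thresholds from 0.01 to 0.99, the area is 0.75.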

Finally, comparing κ-values of ordinal models (κ_γ
and κ_dyx) to κ-values of binomial models (κ_B), as well
as to the ROC plots, on the evaluation data set shows
that (1) ordinal models provide slightly better
presence/absence predictions when the threshold is chosen
by optimizing γ rather than d_yx (κ_γ is always higher
than κ_dyx, except in the case of N. stricta and E.
viridis), and (2) ordinal models predict
presence/absence as well as binomial models (maximum
absolute difference between κ_γ and κ_B of 0.084; see
Table 25.4) but are additionally able to predict density
cover of plant species satisfactorily.


Figure 25.5. Graphical evaluation of Carex curvula ordinal and binomial models: (a) Values of various accuracy measures at
different probability thresholds (every 0.05 unit between 0 and 1) used for cutting predictions from the training data set into ordinal (γ,
d_yx) or presence/absence (ccr, sensitivity, specificity, Kappa) values, to compare them to observed values on the same scale; (b)
ROC plot for the ordinal model built with ninety-nine thresholds between 0 and 1 (see text); (c) p/a accuracy measures for the
binomial model; (d) ROC plot for the binomial model. Prevalence (the proportion of presences relative to the total number of
observations) is here 0.229. ccr = correct classification rate. Panels (a) and (b): ordinal model; panels (c) and (d): binomial model.


Discussion 


My results show that it is worthwhile using appropriate 
ordinal regression models when the response variable is 
supposedly ordinal in nature or when a continuous 
quantitative response variable is sampled in an ordinal 
fashion (e.g., considering intervals of values). Using a
Gaussian model (in this case without log-linearizing the
response) proved particularly unsuitable, as this model
predicts negative values down to -6, and classes below
zero are biologically meaningless. In addition, the
Gaussian model was systematically unable to predict
the highest class or even the next highest. The Poisson
model proved better in this respect as it was able to
predict the highest class and cannot predict negative
values. The proportion of deviance explained by the
Poisson models corresponded closely to that of the ordinal
models. However, in a Poisson GLM, the predictions 
are not constrained to fit the range of possible values 
for the outcome (although this might perhaps be 
done), as they are in ordinal (see Guisan et al. 1998; 
Guisan and Harrell 2000) and binomial models (due to 
the inverse logit [or probit] link function, which can 
only predict values in the range 0 to 1; see Guisan et al. 
1999). As a result, values of up to 22 were predicted by
the Poisson model for C. curvula, even though the
highest possible ordinal class is 5, corresponding to a
ground-cover density of 50 to 100 percent. The
procedure for evaluating ordinal models bears the same
intrinsic limitations as for evaluating other types of
models, as reviewed for presence/absence binary models by
Fielding and Bell (1997). Although I used Gamma and
Somers' d_yx for evaluating ordinal models, other
measures are proposed in the literature, such as Kim's dyy,
Wilson's dy, or Kendall's tau (see Gonzalez and Nelson
1996 for more details on these measures), which might
prove more appropriate in other situations.
With the exception of Kendall's tau, a threshold needs
to be chosen with all these measures of agreement to
compare continuous predictions that are sums of
probabilities to discrete ordinal observations. As stated by
Fielding and Bell (1997), the problem with such thresh- 
old-dependent measures of accuracy is their failure to 
use all the information predicted by the model. Unfor- 
tunately, the ROC plot method they recommend for 
presence/absence models is hardly applicable to ordinal 
predictions since there are no equivalent measures to 
sensitivity and specificity in the case of ordinal confu- 
sion matrices. 

Hence, I propose using a measure of ordinal
agreement (e.g., γ, d_yx) conjointly with associated
κ-values obtained by recoding ordinal data into
presence/absence (0 stays 0; 1 to 5 become 1) at
successive probability thresholds, and with the
associated ROC plot, because this allows one to check
the ability of the ordinal model to predict at least
presence/absence satisfactorily, as shown by the models
presented in this study. Finally, the implementation of
ordinal models into a GIS proved successful and not too
complex.

Ordinal response regression models applied to 
mimic the spatial distribution of plant species can eas- 
ily be extended to simulate animal distributions when 
the response is measured on a semiquantitative scale 
(e.g., visual abundance classes in entomology). Recod- 
ing such data into presence/absence, as often seen in 
literature to fit a logistic regression model, necessarily 
involves losing ecological information that may be of 
primary importance in some applications (e.g.,
conservation studies). In plant ecology, predicting plant
assemblages can hardly be made from presence/absence
data alone. Here, simulating the cover density of (at
least) each dominant species is necessary to yield
realistic predictions of plant communities.

Recommendations for future research in modeling 
species occurrence include according greater consider- 
ation to the probability distribution of response vari- 
ables and choosing an appropriate statistical model 
accordingly. This is particularly important if models 
are to be used in a conservation perspective, for exam- 
ple, for suggesting important areas to protect based on 
their modeled species pool. GLMs are very effective in 
this respect, because they allow for consideration of 
nearly all possible distributions (Guisan and Zimmer- 
mann 2000) from qualitative (multinomial) to quanti- 
tative (Gaussian, Poisson, negative-binomial, bino- 
mial, etc.), with intermediate semiquantitative (i.e., 
ordinal) and, as a special case (i.e., either quantitative 
or qualitative), binary distributions (treated as bino- 
mial; see Guisan et al. 1999). Finally, more detailed 
studies on the evaluation of such ordinal models 
should be encouraged. 


Patch-based Models to Predict
Species Occurrence: Lessons from
Salmonid Fishes in Streams


Jason B. Dunham, Bruce E. Rieman, and James T. Peterson


Environmental heterogeneity often produces patchy
or discontinuous distributions of organisms. Even 
broadly distributed species show localized peaks of 
abundance (Maurer 1999). This is particularly obvi- 
ous in stream ecosystems, where patch dynamics is a 
dominant theme (Pringle et al. 1988). Features of the 
environment that may influence species occurrence in 
streams are believed to result from a hierarchy of 
physical processes operating within drainage basins. 
This idea has formed the basis of several classification 
schemes for stream habitats (e.g., Frissell et al. 1986; 
Hawkins et al. 1993; Imhof et al. 1996; Naiman 
1998; see Morrison and Hall [Chapter 2] for a definition
of "habitat"). These classifications provide a useful
framework for understanding physical processes
that generate stream habitat over areas of varying size 
and spatial resolution, but they do not explicitly con- 
sider how individual species actually perceive or uti- 
lize these patchy environments. 

To be most useful, patches should be clearly defined 
by associations between a biological response (e.g., re- 
production, migration, feeding) and environmental 
variability (Addicott et al. 1987; Kotliar and Wiens 
1990). Classifications of aquatic habitat based purely 
on physical characteristics or subdivision of watersheds
into arbitrary segments may not adequately describe
"patchiness" from an organism's point of view.
Lack of attention to realistic scaling of environmental
variation and biological responses can produce weak
or misleading inferences (Goodwin and Fahrig 1998). 
Our focus in this chapter is on definition of patches 
suitable for supporting local breeding populations. 
This is a key prerequisite for applying ideas from 
metapopulation and landscape ecology to predicting 
species occurrence. 

Here, we review our attempts to develop patch- 
based classifications of aquatic habitat and models to 
predict occurrence of salmonid fishes in streams. We 
begin with a brief overview of the concept of patchi- 
ness. Next, we outline criteria to define the biological 
response of interest: occurrence of local populations. 
We then describe models to predict the distribution of 
local populations within stream basins. These models 
allow delineation of patches of suitable habitat within 
watersheds and definition of patch structuring. Pat- 
terns of patch structuring and characteristics of indi- 
vidual patches provide the basis for modeling occur- 
rence of local populations. We compare patch-based 
models of occurrence for two threatened salmonids: 
bull trout (Salvelinus confluentus) and Lahontan cut- 
throat trout (Oncorhynchus clarki henshawi). Finally, 
we compare our results to alternative approaches to 
predict occurrence of salmonids and discuss implica- 
tions of a patch-based approach that should be 
generally relevant for developing models of species
occurrence.


The Concept of Patchiness


The term “patch” has been applied in numerous con- 
texts in ecology (e.g., Pickett and White 1985; McCoy 
and Bell 1991; Pickett and Rogers 1997; Morrison 
and Hall, Chapter 2). Our definition of patches paral- 
lels the concept of ecological neighborhoods intro- 
duced by Addicott et al. (1987). Ecological neighbor- 
hoods are defined by a specific biological response and 
not by an arbitrary temporal or spatial scale or by a 
perceived boundary or control imposed on the system. 
A “patch” corresponds to limits or boundaries of en- 
vironmental conditions that can support a biological 
response. Patches of environmental conditions poten- 
tially suitable to support local populations of a species 
are often the focus in landscape and metapopulation 
ecology. 

Kotliar and Wiens (1990) provided a general 
framework for defining patch structure. Given that a 
biological response is observed to occur within a de- 
finable spatial frame, patch structure can be character- 
ized by (1) the degree to which patches can be distin- 
guished from each other and the surrounding 
environment (patch contrast), and (2) how patches are 
spatially aggregated. Patch structuring may be charac- 
terized by a nested or hierarchical pattern and may 
vary widely among biological responses.

Patch structuring is not directly synonymous with a 
specific temporal or spatial scale. For example, 
patches defined here may vary by an order of magni- 
tude or more in size (patch area). It is definition of 
common biological responses and environmental crite- 
ria for determining patch structure, not spatial or tem- 
poral scale per se, that provides a foundation for 
patch-based models of species occurrence that may be 
generalized within and among species. 


Defining a Biological Response 


We were interested in predicting occurrence of fish in 
patches of habitat suitable for local breeding popula- 
tions. Patches suitable for local populations should 
correspond to locations where population growth can 
be attributed primarily to in situ reproduction, rather 
than immigration (Addicott et al. 1987). Limited
demographic interaction among local populations
implies some degree of reproductive (genetic) isolation.
Spatial isolation of spawning and rearing habitat for 
salmonids is reinforced by strong natal homing 
(Quinn 1993), and patches of suitable habitat may 
therefore support relatively discrete local populations. 
Ultimately, it would be desirable to use multiple 
sources of information to delineate local breeding 
populations. Several studies have demonstrated the 
limitations of using limited genetic or demographic in- 
formation alone to infer population structuring (Ims 
and Yoccoz 1997; see Utter et al. 1992, 1993 for 
salmonid examples). 

Unfortunately, detailed genetic and demographic
data are not available for most systems. For
salmonids, we have defined patches of suitable habitat by
modeling the distribution limits of smaller,
presumably "premigratory" or resident individuals within
streams. Larger juvenile and adult salmonids may
adopt migratory life histories (Northcote 1997; Fig.
26.1) and range far outside of spawning and rearing
areas, but their existence ultimately depends on
spawning and rearing habitat. Our delineation of
patches for salmonids, then, is based principally on
ecological information and an assumption of natal
homing. Population genetic analysis (e.g., Kanda
1998; Spruell et al. 1999) for some systems indicates
genetic divergence does correspond to juvenile
distributions. Our approach to defining patches appears to
be a reasonable approximation, but detailed
demographic and/or genetic data will be necessary to
confirm the structure of any system (Haila et al. 1993;
Rieman and Dunham 2000).


Figure 26.1. Simplified schematic of life-history variation in
salmonid fishes. Fish with a "resident" life history spend their
entire lives within spawning and rearing areas. Migratory fish
use habitats outside of spawning and rearing areas but return
faithfully (homing) to breed in natal areas. Some dispersal is
possible among both life-history types. Our definition of
patches corresponds to the extent of spawning and rearing.


Models of Distribution Limits
and Patch Delineation


Unlike terrestrial habitats, streams are generally 
viewed as one-dimensional systems in terms of fish 
distributions and dispersal. Therefore, boundaries of 
habitat patches may be delineated in an up- and/or 
downstream direction. Many factors can potentially 
limit the distribution of spawning and rearing habitat 
for salmonids, including natural and artificial disper- 
sal barriers, water temperature, interactions with non- 
native salmonids and other fishes, human disturbance, 
and geomorphic influences. These factors are often 
not independent. For example, interspecific interac- 
tions mediated by water temperature may influence 
longitudinal distributions of species within streams 
(De Staso and Rahel 1994; Taniguchi et al. 1998). 

In the case of bull trout and cutthroat trout, spawn- 
ing and early rearing usually occur in upstream or 
headwater habitats (often fourth-order streams or 
smaller), so we were particularly interested in factors 
that determine downstream distribution limits of juve- 
niles. Our two study areas are located at the southern 
margin of the range for both species, where unsuitably 
warm summer water temperatures in streams are 
probably an important factor limiting the amount of 
suitable habitat (Rieman et al. 1997; J. B. Dunham 
and B. E. Rieman unpublished data). Local popula- 
tions of these species in other areas may be delineated 
by different habitat characteristics, such as availability 
of high-quality spawning habitat (Baxter et al. 1999; 
also see Geist and Dauble 1998), barriers (e.g., dams, 
waterfalls, subsurface flow), and sharp transitions in 
habitat that occur as tributary streams flow into larger 
streams or lakes. 

There are many recent examples of attempts to
classify aquatic habitat for salmonids based on
different indicators related to variability in stream
temperature. Various researchers have classified thermally
suitable habitat from variation in groundwater (Meisner
1990; Nakano et al. 1996), air (Keleher and Rahel
1996), and surface water temperatures (Eaton et al.
1995; Rahel et al. 1996). Our approach is currently
based on modeling elevation gradients, which are
correlated with temperature (Keleher and Rahel 1996).
Our attempts to delineate the amount and distribution
of suitable habitat (i.e., patch structure) for salmonids
have relied on empirical relationships between
downstream distribution limits of juveniles and elevation or
geographic gradients (see also Flebbe 1994). Our
work has been with Lahontan cutthroat trout in the
eastern Lahontan Basin in southeast Oregon and
northern Nevada, and bull trout in the upper Boise
River Basin in southern Idaho (Fig. 26.2).


Figure 26.2. Map of study areas: the upper Boise River Basin,
Idaho, and the eastern Lahontan Basin, Nevada.

Delineation of patches for bull trout in the Boise 
River Basin relied on information from surveys of ju- 
venile distributions, which suggested a sharp increase 
in occurrence above an elevation of 1,600 meters (Rie- 
man and McIntyre 1995; Dunham and Rieman 1999). 
This distribution limit was used to delineate the 
amount and distribution of suitable habitat patches 
within the basin. In the case of Lahontan cutthroat 
trout, a geographic model was necessary to account 
for changes in the elevation of distribution limits over 
the eastern Lahontan Basin, which covers a much 
broader area (Dunham et al. 1999). Geographic 
location (latitude and longitude) explained over 70 percent 
of the variation in the elevation of downstream distribution 
limits for Lahontan cutthroat trout. 

Patch delineation involved linking models of down- 
stream distribution limits with a geographic informa- 
tion system (GIS). Predicted downstream distribution 
limits were used to delineate the size and distribution 
of watersheds with suitable habitat. We defined 
patches of suitable habitat as the watershed area up- 
stream of predicted elevations for downstream distri- 
bution limits. Defining patches in terms of watershed 
area is consistent with the view that watershed charac- 
teristics have an important influence on stream habi- 
tats (Montgomery and Buffington 1998). Local or re- 
gional variation in watershed characteristics may have 
an important influence on the development of stream 
channels and aquatic habitat (Burt 1992), and patch 
structuring and patterns of species occurrence may 
vary accordingly. 

An alternative, and perhaps more precise, measure 
of patch size would be actual length of stream occu- 
pied within a watershed. Length of stream occupied 
requires information on both up- and downstream 
distribution limits, whereas watershed area requires 
only information on downstream distribution limits. 
Stream length might be important where there is 
strong local variability in climate and geomorphology, 
or when barriers to fish movement within streams 
limit upstream distributions. If barriers are important, 
fish may only be able to occupy a very limited amount 
of habitat, and patch sizes estimated by stream length 
and watershed area could differ substantially. Limited 
evidence suggests the influence of barriers on fish dis- 
tributions within streams is generally minimal, though 
important exceptions do exist (e.g., Kruse et al. 1997; 
Dunham et al. 1999). 

Another potentially important localized factor is 
occurrence of nonnative trout. In the case of Lahontan 
cutthroat trout, for example, downstream distribution 
limits were significantly restricted when nonnative 
trout were present (Dunham et al. 1999). This effect 
was not consistent or predictable, so we could not 
simply account for the effect of nonnative trout in 
defining distribution limits and patch sizes. Earlier 
models of occurrence of Lahontan cutthroat trout did 
not detect an effect of nonnative trout (Dunham et al. 
1997), but that study did not provide a clear definition 
of patch structure. 

Because localized factors within streams (e.g., geo- 
morphic features, nonnative fish) may place con- 
straints on the amount of habitat that can be occu- 
pied, patch areas may not reflect the "effective" size of 
habitat available to fish. To remedy this potential 
problem, we examine model interaction terms and 
prediction errors for streams with and without known 
constraints. The alternative is to directly map local 
features of stream habitats across large areas to delin- 
eate patches, which is often difficult to justify with 
limited resources. 


Modeling Species Occurrence 


Delineation of patches and patch structuring within 
drainage networks provides a template for predicting 
species occurrence. Our approach is essentially a two- 
tiered model: (1) define distribution limits in terms of 
geographic or elevation gradients to delineate suitable 
habitat patches, (2) predict occurrence of fish within 
patches. To predict species occurrence, we have fo- 
cused on influences of the geometry of patches within 
landscapes, namely the size, isolation, and spatial dis- 
tribution of patches (see Rieman and Dunham 2000). 
Here, we focus on patch size. Patch size may be related 
to fish occurrence because habitats in larger patches 
may be more complex and resilient to disturbance and 
should generally support larger populations. 

Rieman and McIntyre (1995) used multiple logistic 
regression to model occurrence of bull trout in the 
Boise River Basin and found patch area to be the 
strongest predictor. Other significant factors included 
patch isolation and road density within patches (Dun- 
ham and Rieman 1999). Solar radiation and occur- 
rence of nonnative brook trout were not associated 
with occurrence, and patterns of occurrence were not 
spatially aggregated (Dunham and Rieman 1999). Lo- 
gistic regression of occurrence of Lahontan cutthroat 
trout in relation to patch area revealed a highly signif- 
icant (P « 0.0001) and positive relationship (J. B. 
Dunham unpublished data). 

Although analyses of data for Lahontan cutthroat 
trout are preliminary, some interesting common 
themes are suggested by the results. For both species, 
patch area appears to be a significant correlate of 
species occurrence. This is a common pattern emerging 
for many species in both terrestrial and aquatic 
ecosystems (Bender et al. 1998; Moilanen and Hanski 
1998; Magnuson et al. 1998; Hanski 1999) and a general 
prediction from island biogeography and 
metapopulation theory (Hanski and Simberloff 1997). 
More interesting are the details of the relationship between 
patch size and species occurrence. 

Figure 26.3. Comparison of patch size distributions for bull 
trout (Salvelinus confluentus) in the upper Boise River Basin 
and Lahontan cutthroat trout (Oncorhynchus clarki henshawi) in 
the eastern Lahontan Basin. 

First, an examination of patch size distributions re- 
veals that size distributions are remarkably similar for 
both species and skewed toward very small patches 
(Fig. 26.3). This means that very few patches are likely 
to have a high probability of occurrence and that a 
few large patches may be very important for both 
species. In terms of total area of potential habitat oc- 
cupied, bull trout in the Boise River Basin occupy rel- 
atively more (46 percent) than Lahontan cutthroat 
trout in the eastern Lahontan Basin (36 percent). 
When the actual responses of both species to changes 
in patch size are compared (Fig. 26.4), it is clear that 
both species are likely to occur when patch sizes exceed 
about 10⁴ ha in area. Relative to Lahontan cutthroat 
trout, bull trout are more likely to occur in 
smaller patches (Fig. 26.4), which may explain why 
bull trout occupy a larger percentage of suitable habi- 
tat overall. 

Because a common definition for patches was used 
for both species, we were able to compare specific responses 
of each to variability in patch size. In the future, 
analysis of occurrence in relation to other characteristics 
of patches may reveal additional insights. 
Although the biology of these two species differs in 
important ways (Rieman and Dunham 2000), these 
results provide general themes to guide efforts to conserve 
and manage these species, along with important 
details relevant to particular species or environments 
they inhabit. 

Figure 26.4. Predicted probability of occurrence in relation to 
patch size (area) for bull trout (Salvelinus confluentus) in the 
upper Boise River Basin and Lahontan cutthroat trout 
(Oncorhynchus clarki henshawi) in the eastern Lahontan Basin. 


Evaluating Model Prediction 


The most relevant measure of a classifier is its expected 
error rate (EER, Lachenbruch 1975). Among possible 
EER estimators, leave-one-out cross-validation is a 
nearly unbiased estimator of out-of-sample model per- 
formance (Fukunaga and Kessel 1971) that provides a 
measure of overall predictive ability without excessive 
variance (Efron 1983). Leave-one-out cross-validation 
involved removal of an individual observation from the 
data set, fitting a model with the remaining observa- 
tions, and predicting the omitted observation. Model 
probabilities of occurrence greater than or equal to 0.50 
were classified as predicted occurrences. Fits between 
observed and predicted occurrences were summarized 
as classification error rates, calculated over all observations 
and by response. 
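
The leave-one-out procedure described above can be sketched in a few lines. This is a minimal illustration only, not the analysis reported here: it assumes a single predictor (log10 patch area), uses invented toy data, and fits the logistic model with a small hand-written Newton-Raphson routine. All function and variable names are hypothetical. Classification errors are tallied by observed response and prediction errors by predicted response, with the 0.50 cutoff from the text.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, iters=100, tol=1e-10):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson (single predictor)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient, intercept
            g1 += (y - p) * x      # gradient, slope
            h00 += w               # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 0.0:
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def loocv(xs, ys, threshold=0.5):
    """Leave one observation out, refit, and predict the omitted observation."""
    preds = []
    for i in range(len(xs)):
        b0, b1 = fit_logistic(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(1 if sigmoid(b0 + b1 * xs[i]) >= threshold else 0)
    n = len(ys)
    overall = sum(p != y for p, y in zip(preds, ys)) / n
    classification, prediction = {}, {}
    for cls in (0, 1):
        observed = [i for i in range(n) if ys[i] == cls]
        predicted = [i for i in range(n) if preds[i] == cls]
        # Error among patches observed in this class (omission).
        classification[cls] = sum(preds[i] != cls for i in observed) / len(observed)
        # Error among patches predicted to be in this class (commission).
        prediction[cls] = (sum(ys[i] != cls for i in predicted) / len(predicted)
                           if predicted else None)
    return overall, classification, prediction, preds

# Invented toy data: log10 patch area (ha) and occupancy (1 = occupied).
xs = [2.1, 2.4, 2.6, 2.9, 3.0, 3.2, 3.3, 3.5, 3.6, 3.8, 4.0, 4.2, 4.4, 4.7]
ys = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
overall, classification, prediction, preds = loocv(xs, ys)
```

The overall rate is the sample-size-weighted average of the two classification rates, which is why a rare response can have a high error rate without moving the overall figure much.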

Because the results for Lahontan cutthroat trout 
were preliminary, we focused on classification (omis- 
sion) and prediction (commission) errors for the 
model of bull trout occurrence reported by Dunham 
and Rieman (1999). Based on the overall classification 
error rate (Table 26.1), our logistic regression model 
was fairly accurate with a 19.7 percent error rate. This 
suggests a good fit between the data and our logistic 
regression model, but this simple measure of predic- 
tive ability does not reveal insights into potential bias 
or sources of error. When modeling species occur- 
rences, biological responses are usually approximated 
assuming some predefined statistical distribution. For 
example, logistic regression assumes a binomial re- 
sponse distribution and a logit link. Hence, model ac- 
curacy is likely a function of how faithfully the distri- 
bution approximates the biological response. To 
illustrate, we fit the bull trout occurrence data to a k- 
nearest-neighbor (KNN) model, a relatively flexible 
nonparametric classification technique that does not 
require distribution assumptions or a strong assump- 
tion implicit in specifying a link function (Hand 
1982). The overall KNN error rate was 17.3 percent, 
only slightly lower than the logistic regression model, 
which suggested a reasonable fit of the data to the as- 
sumed binomial distribution. Category-wise (i.e., re- 
sponse-specific) classification and prediction error 
rates can also be influenced by response-specific sam- 
ple size (Agresti 1990). In general, error rates are 
higher for less-frequent responses. For instance, bull 
trout occurrence had the smaller sample size and the 
higher prediction and classification error rates for 
both the logistic regression and KNN models (Table 
26.1). Additionally, logit model error rates are influ- 
enced by the choice of the baseline category (e.g., 
modeling presence or absence, Agresti 1990). 
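
As a check on the arithmetic, the overall error rates quoted above can be recovered from the response-specific classification rates and sample sizes in Table 26.1. The short sketch below weights each rate by its sample size; the dictionary layout and function name are illustrative only.

```python
# Sample sizes and leave-one-out classification error rates from Table 26.1.
n = {"occupied": 29, "unoccupied": 52}
rates = {
    "logistic": {"occupied": 0.276, "unoccupied": 0.154},
    "knn": {"occupied": 0.207, "unoccupied": 0.154},
}

def overall_error(class_rates, counts):
    """Sample-size-weighted average of the response-specific error rates."""
    total = sum(counts.values())
    return sum(class_rates[c] * counts[c] for c in counts) / total

overall = {model: overall_error(r, n) for model, r in rates.items()}
```

Both values land within rounding of the 19.7 and 17.3 percent reported in the text; small differences reflect the three-decimal rounding of the tabulated rates.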

TABLE 26.1. 

Summary of leave-one-out cross-validation classification and 
prediction^a error rates^b for bull trout patch occupancy models. 

Patch status    n     Logistic regression    k-nearest neighbor^c 
Occupied        29    0.276 (0.276)          0.207 (0.258) 
Unoccupied      52    0.154 (0.154)          0.154 (0.120) 

^a Prediction in parentheses. 
^b Omission and commission errors, respectively. 
^c k = 12 nearest neighbors. 

Choice of statistical model may be important, but 
error in determination of occurrence can also bias 
model predictions. For example, bull trout in smaller 
patches may also occur at lower densities, which may 
affect probability of detecting fish. Another important 
point is that predictions from the model are not pre- 
cise. Lower and upper confidence intervals for slope 
estimates of the patch area effect range from 48 to 61 
percent of the point estimate (see Dunham and Rieman 
1999). We do not have enough confidence in the 
precision of our models to believe that model “accuracy” 
can be reasonably assessed by analysis of errors 
of omission or commission alone. In other words, 
there is a need for both statistical and biological “validation” 
of models. We suspect similar limitations 
apply to many other models of species occurrence. 


Implications for Models of 
Species Occurrence 


The efficacy of a patch-based approach depends on 
how clearly patches can be defined. Even if the defini- 
tion of patch structuring is clear, it may not be realis- 
tic to treat patches as independent of the landscape in 
which they are embedded. The degree to which the 
landscape “matrix” should be considered in models of 
species occurrence will depend on patch contrast, ag- 
gregation, and scale of study (Kotliar and Wiens 1990; 
Wiens 1996a,b), as well as life history characteristics 
of the species in question. For example, many 
salmonids have complex migratory behaviors, and in- 
teractions between migratory behavior and habitat 
outside of spawning and rearing areas (e.g., Fig. 26.1) 
may affect species occurrence within patches (Rieman 
and Dunham 2000). 

A patch-based approach to modeling patterns of 
species occurrence has several important advantages. 
In the simplest sense, a patch-based approach permits 
stratification of models of species occurrence. In 
purely statistical terms, stratification may be a useful 
tactic to increase the precision of model predictions. 
Lack of consideration or knowledge of patch structure 
can produce mismatched inferences between patterns 
of occurrence and habitat characteristics. Many 
common approaches to subdividing landscapes into 
pixels, polygons, political boundaries, and so forth 
may not adequately reflect patch structuring (e.g., Fig. 
26.5). 

Figure 26.5. Watersheds are often subdivided into hydrologic 
units (HUs) for classification and analysis of aquatic habitats 
(Maxwell et al. 1995). This overlay of sixth-field HUs (thin lines) 
and patches (heavy lines with shading) for bull trout (Salvelinus 
confluentus) in the Boise River Basin shows that patch and HU 
watershed boundaries can be substantially different. 


Patch Structure and Scale 


Patch structure can have important implications for 
finer-scale models of occurrence. Often, fish-habitat 
relationships for salmonids are considered within rel- 
atively small sites ranging from individual pools and 
riffles to stream segments (10⁰–10² meters) (Fausch et 
al. 1988; Angermeier et al., Chapter 46). At these 
spatial scales, occurrence of fish among sites is proba- 
bly not independent, because they are nested within a 
larger area supporting a population that is influenced 
by larger-scale environmental variation and fish 
movement (Schlosser 1995; Gowan and Fausch 
1996b). 

At finer scales (e.g., stream segments within 
patches), the existence of suitable but unoccupied 
habitat is a possibility, especially when patch dynam- 
ics are characterized by extinction and/or (re)coloniza- 
tion (Rieman and Dunham 2000). This implies that 
absence of fish at sites may not be a function of site-scale 
habitat quality, but rather a characteristic of a 
larger patch (e.g., size, isolation), within which sites 
are nested. As hypothesized by Kotliar and Wiens 
(1990), larger-scale environmental variation may place 
important constraints on patterns nested at smaller 
scales. Nonspatial models of species occurrence im- 
plicitly assume organisms are free to select all habitats, 
but this is not true if external constraints (e.g., spatial 
isolation, dispersal barriers) or internal constraints 
(e.g., homing, philopatry) are important (see Rosen- 
berg and McKelvey 1999 for a recent example). 

At larger ecological scales, patches supporting local 
populations may be aggregated within landscapes, 
perhaps forming metapopulations or “semi-independent 
networks” (Hanski 1999; Rieman and Dunham 
2000). For salmonids in streams, patterns of hydro- 
logic connectivity may produce spatially aggregated 
clusters or “networks” of patches. Whether defined as 
metapopulations or otherwise, aggregates of patches 
supporting local populations can be characterized by a 
number of larger-scale characteristics, such as number 
of patches, size distribution, isolation, land form, land 
type, or climatic associations. Models linking species 
occurrence to aggregate characteristics may be consid- 
ered in terms of occurrence of a single or multiple 
local populations. 


Management Applications 


One of the strongest motivations for better predictive 
models of species occurrence is the need for better 
data to support conservation planning for threatened 
and endangered species (the two species described 
here are listed as threatened under the U.S. Endan- 
gered Species Act). Key steps in conservation planning 
include (1) delineation of units for conservation, (2) 
risk or status assessment of the units, and (3) prioriti- 
zation of management and species recovery actions. 
Delineation of biotic units and selection of appropri- 
ate biological responses and/or criteria have been the 
focus of much debate, especially for salmonid fishes 
(e.g., Nielsen 1995; see also Paetkau 1999). Patch- 
based models provide a useful context for integrating 
different elements of biological diversity (e.g., compo- 
sitional, structural, functional; Franklin 1988) that 
should be considered in delineation of conservation 
units at different scales (Noss 1990b). In terms of risk 
or status assessments, patch-based models provide 
managers with a useful description of the amount and 
distribution of suitable habitat (e.g., patch size, isola- 
tion). Predictions from models of occurrence can pro- 
vide important information to address fundamental 
questions about habitat conservation (e.g., How many 
habitats? How large? Where to focus?). Other patch 
attributes (e.g., occurrence of other species, land 
cover, ownership, climatic data, etc.) can also be 
added to prioritize species recovery actions. 

Patch-based models have provided useful insights 
for a wide variety of species (see Hanski 1999). Ignor- 
ing potential patch structure or applying overly sim- 
plistic or unrealistic habitat classifications may result 
in models with little biological relevance or poor predictive 
power. The past several years have seen a dramatic 
increase in our capacity to generate predictor 
variables (e.g., via GIS, online databases, etc.) and bet- 
ter analytical models to predict species occurrence. As 
our ability in these important areas increases, how- 
ever, we should not lose sight of the biological re- 
sponses we wish to understand and predict. 


Autologistic Regression Modeling of 
American Woodcock Habitat Use with 
Spatially Dependent Data 


David S. Klute, Matthew J. Lovallo, and Walter M. Tzilkowski 


Autocorrelation is a frequently observed characteristic 
of spatial variables but is rarely considered 
in models of wildlife-habitat relationships. Random 
variables are said to be spatially autocorrelated if 
neighboring points are more (positive autocorrelation) 
or less (negative autocorrelation) similar than would 
be expected for random groups of observations. Thus, 
the value of a spatially autocorrelated random vari- 
able can be in part predicted by values of the variable 
at neighboring locations (Legendre 1993). The struc- 
turing of environmental elements in ecologically func- 
tional forms often results in positive spatial autocorre- 
lation, and many ecological principles and theories 
(e.g., competition, population growth, species-habitat 
relationships) rely on an assumption of spatial de- 
pendencies among interacting elements (Legendre and 
Fortin 1989). Therefore, explicit consideration of spa- 
tial dependencies is fundamental to a complete under- 
standing of ecological processes. 

Ecologists frequently use classical statistical tech- 
niques (e.g., t-tests, linear and logistic regression) that 
assume independence among data points (Rossi et al. 
1992). Spatial autocorrelation violates the assumption 
of independence and can result in incorrect inference 
when using classical statistical techniques. Positive 
spatial autocorrelation results in overestimating the effects 
of ecological covariates in descriptive or predictive 
models and in declaring too often that significant 
differences exist when in fact they do not (Cliff and 
Ord 1981; Augustin et al. 1996). Correct inference 
can be drawn by extracting spatial dependencies and 
then analyzing residuals or by explicitly modeling spa- 
tial autocorrelation (Legendre and Fortin 1989). 

Researchers have used a variety of techniques to de- 
velop species-habitat relationships models using spa- 
tial data. Logistic regression (Pereira and Itami 1991; 
Mladenoff et al. 1995), Mahalanobis distance (Clark 
et al. 1993b), log-linear (Homer et al. 1993), and 
Bayesian techniques (Milne et al. 1989; Aspinall and 
Veitch 1993) have been used to construct models of 
species’ presence or abundance as a function of eco- 
logical covariates. However, none of these studies ex- 
plicitly examined or accounted for spatial dependen- 
cies in response or predictor variables. The extensive 
use and increased availability of spatial databases and 
geographic information systems (GIS) warrants in- 
creased awareness and application of techniques for 
dealing with spatial dependencies. 

Ecologists are often interested in modeling binary 
responses (e.g., present or absent, successful or unsuc- 
cessful). Logistic regression models the log-odds (logit) 
of the response variable as a linear function of predic- 
tor variables and is well suited to binary response 
data. However, logistic regression requires independ- 
ence of observations and is not appropriate for use 
with spatially autocorrelated data. An appropriate 
method for modeling binary responses in the presence 
of spatial dependencies is autologistic regression. The 
autologistic model incorporates spatial autocorrela- 
tion by conditioning the response for a given location 
on the values of the response at neighboring locations 
(Gumpertz et al. 1997). As with standard logistic re- 
gression, autologistic models can also use relevant co- 
variates as predictors. By explicitly accounting for 
spatial dependencies, more parsimonious models that 
provide a better indication of the importance of pre- 
dictor variables in influencing the distribution of the 
response variable are expected (Augustin et al. 1996). 
Autologistic regression is a flexible technique and has 
been used in ecological applications to model the spa- 
tial distribution of bark-beetles (Preisler 1993), disease 
in bell peppers (Gumpertz et al. 1997), habitat suit- 
ability for red deer (Cervus elaphus) (Augustin et al. 
1996) and the distribution of plant species (Huffer 
and Wu 1998). 

Systematic sampling techniques are often used to 
assess the presence or abundance of wildlife species. It 
is expected that response and predictor variables from 
systematic surveys will frequently be spatially autocor- 
related. American woodcock (Scolopax minor) are a 
highly secretive species, although courtship displays 
by males can be easily observed. However, observa- 
tion of courtship activity is temporally limited both 
daily and seasonally. Because of these limitations, sys- 
tematic roadside surveys (i.e., singing-ground surveys) 
consisting of ten observation points (i.e., “stops”) are 
used to detect woodcock habitat use and monitor pop- 
ulations (Bruggink 1997). Because of their systematic 
structure, singing-ground survey results and associated 
habitat variables are expected to be highly spatially 
dependent. The spatial distribution of woodcock on 
singing-ground survey routes may be due to the effects 
of habitat variables or, alternatively, may be depend- 
ent on the response status of neighboring locations. 
Because singing-ground survey stops within routes are 
not spatially independent, construction of habitat rela- 
tionships models must occur at the route level or spa- 
tial dependencies must be explicitly modeled at the 
stop level. Autologistic regression provides a tech- 
nique that explicitly accounts for the spatial depend- 
encies resulting from route-based survey techniques 
while allowing investigation of relationships between 
woodcock presence and associated habitat variables. 


We investigated the relative performance of logistic 
and autologistic regression for modeling habitat rela- 
tionships of American woodcock based on singing- 
ground surveys. Our objectives were to (1) quantify 
the degree of autocorrelation in response and predic- 
tor variables from singing-ground surveys and re- 
motely sensed habitat data, (2) determine the effec- 
tiveness of logistic regression and autologistic 
regression for modeling spatial dependencies and 
habitat relationships, and (3) compare the reclassifica- 
tion performance of logistic regression and autologis- 
tic regression models to determine their potential use- 
fulness as predictive models of woodcock habitat 
suitability. Our hypothesis was that models that explicitly 
incorporated spatial dependencies would provide a 
more appropriate view of woodcock-habitat relation- 
ships and provide better predictive power than could 
be obtained by using methodologies that did not ex- 
plicitly incorporate spatial dependencies. 


Study Area and Methods 


Singing-ground surveys were used to detect presence 
of singing male woodcock (Tautin et al. 1983). Routes 
were randomly selected from an existing road data- 
base using the ArcInfo GIS software (Environmental 
Systems Research Institute, Redlands, Calif.). In 1996, 
sixty-seven singing-ground survey routes were sur- 
veyed, primarily in the Ridge and Valley Province of 
Pennsylvania (Cuff et al. 1989). Sixty-one of these 
routes were surveyed twice and six were surveyed 
once. In 1997, twenty-nine different singing-ground 
survey routes were surveyed. All twenty-nine of these 
routes were surveyed only once. Starting points of 
routes were located at easily identifiable points (e.g., 
road intersection). Ten survey stops were located 
along each route at 0.64-kilometer intervals. 

All singing-ground surveys were conducted by per- 
sonnel from the Pennsylvania Cooperative Fish and 
Wildlife Research Unit, Pennsylvania Game Commis- 
sion, or by experienced volunteer observers between 
15 April and 5 May. Surveys began twenty-two min- 
utes after local sunset if the sky was less than or equal 
to 75 percent overcast and fifteen minutes after local 
sunset if the sky was more than 75 percent overcast. 
Surveys were not conducted if the temperature was 
less than 4.4 degrees Celsius, if wind speeds were 
greater than 15 kilometers per hour, or if it was rain- 
ing. Observers listened for two minutes and recorded 
numbers of singing males heard at each stop. If one or 
more woodcock was detected, the stop was classified 
“present”; otherwise, the stop was classified “absent.” 
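
The survey rules above amount to a small set of thresholds, which can be written down as a sketch. The function names are hypothetical; the numeric cutoffs are those stated in the text.

```python
def start_delay_minutes(overcast_percent):
    """Minutes after local sunset at which a survey begins."""
    return 22 if overcast_percent <= 75 else 15

def conditions_allow_survey(temp_c, wind_kmh, raining):
    """False if it is colder than 4.4 C, windier than 15 km/h, or raining."""
    return temp_c >= 4.4 and wind_kmh <= 15 and not raining

def classify_stop(singing_males_heard):
    """A stop is 'present' if one or more singing males were detected."""
    return "present" if singing_males_heard >= 1 else "absent"
```

Collapsing counts to presence or absence discards abundance information, but it yields the binary response that logistic and autologistic regression require.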


Habitat Variables 


Habitat variables associated with each singing-ground 
survey stop were measured within 350-meter radius 
circular buffers centered on each stop. We chose a 
buffer of 350 meters because it represents the approx- 
imate maximum detection distance for singing male 
woodcock (Duke 1966). Buffers were constructed 
using ArcInfo software. We used classified thematic 
mapper imagery (30x30-meter resolution) developed 
for the Pennsylvania Gap Analysis project to identify 
coarse-scale habitat elements associated with the 
singing-ground survey stops. The imagery was classi- 
fied to eight habitat types (water, coniferous forest, 
mixed forest, broadleaf forest, early successional, 
perennial herbaceous, annual herbaceous, terrestrial 
unvegetated) using an unsupervised classification 
method (W. L. Myers, School of Forest Resources, 
Pennsylvania State University, personal communication). 
We used the FRAGSTATS spatial pattern analysis 
program to measure five landscape-level and eighteen 
class-level habitat variables. Landscape-level 
metrics described the spatial pattern of the entire land- 
scape by considering all habitat types simultaneously. 
Class-level metrics described the spatial pattern within 
a landscape of a single habitat type (McGarigal and 
Marks 1995). 


Spatial Autocorrelation and Habitat Modeling 


We initially constructed a logistic regression model 
following the protocol and terminology of Hosmer 
and Lemeshow (1989). We fit univariate logistic re- 
gression models and examined likelihood ratio tests 
(G tests) for each variable (PROC LOGISTIC; SAS In- 
stitute 1985). Variables with P less than 0.25 for G 
tests were retained for further analysis (Bendel and 
Afifi 1977; Mickey and Greenland 1989). Variables re- 
tained from univariate analyses were subjected to cor- 
relation analysis (PROC CORR; SAS Institute 
1990a,b) to detect collinearity. Pairs of variables with 
r greater than 0.50 were considered for elimination. 
The decision of which variable to eliminate from each 
correlated pair was based on significance of the uni- 
variate G tests; the more significant variable was re- 
tained. After correlation analysis, a multivariate logis- 
tic regression model was fit using the remaining 
variables. Variables with P greater than 0.25 (Wald chi-square) 
were eliminated from the initial multivariate 
model. All two-way interactions among remaining 
variables were individually considered. Variables were 
mean centered before construction of interaction 
terms to reduce collinearity between main and interac- 
tion effects (Neter et al. 1996). Interaction terms with 
P less than 0.25 from G tests were retained for the 
final model. 
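
The first two screening steps (univariate P less than 0.25, then removal of one member of each correlated pair) can be illustrated as follows. This is a sketch under stated assumptions: the variable names, data values, and univariate P values are invented, the univariate likelihood ratio tests are taken as already computed, and the correlation cutoff is applied to the absolute value of r, which the text does not specify.

```python
import math
from itertools import combinations

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def screen(variables, univariate_p, p_enter=0.25, r_max=0.50):
    """Keep variables passing the univariate test, then thin correlated pairs."""
    kept = [name for name in variables if univariate_p[name] < p_enter]
    dropped = set()
    for a, b in combinations(kept, 2):
        if a in dropped or b in dropped:
            continue
        if abs(pearson_r(variables[a], variables[b])) > r_max:
            # Retain the more significant variable of the correlated pair.
            dropped.add(a if univariate_p[a] > univariate_p[b] else b)
    return [name for name in kept if name not in dropped]

# Hypothetical habitat variables measured at six stops.
variables = {
    "early_succ_pct": [5, 10, 15, 20, 25, 30],
    "edge_density": [11, 19, 32, 41, 48, 62],
    "conifer_pct": [3, 9, 2, 8, 4, 7],
    "water_pct": [1, 0, 2, 1, 0, 3],
}
# Hypothetical P values from univariate likelihood ratio (G) tests.
univariate_p = {"early_succ_pct": 0.01, "edge_density": 0.08,
                "conifer_pct": 0.20, "water_pct": 0.60}
retained = screen(variables, univariate_p)
```

Here `water_pct` fails the univariate screen and `edge_density` is dropped because it is strongly correlated with the more significant `early_succ_pct`.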

We constructed correlograms of Moran's I to inves- 
tigate the degree of spatial autocorrelation in response 
and predictor variables at various scales of influence 
(Moran 1950; Cliff and Ord 1981). We used the S- 
Plus spatial statistics module (MathSoft, Seattle, 
Wash.) to calculate Moran's I for the binary response 
(present or absent) and main effect terms from the lo- 
gistic regression model (Kaluzny et al. 1996). Moran's 
I was calculated in ten distance categories: 0-700, 
700-1,400, . . . , 6,300-7,000 meters. We used 700- 
meter intervals because it was the approximate dis- 
tance between adjacent stops. We calculated one thou- 
sand random permutations of the data to determine if 
spatial autocorrelation for variables was significantly 
different from zero (P less than 0.05) in all distance 
categories. 
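
A correlogram of Moran's I with a permutation test can be sketched in plain Python. This is not the S-Plus routine used in the study: binary weights (1 for pairs whose separation falls in the distance class, 0 otherwise), the two-sided permutation criterion, and the toy data (a single hypothetical line of twenty stops at 640-meter spacing) are all assumptions made for illustration.

```python
import math
import random

def pairs_in_class(coords, lo, hi):
    """Ordered pairs (i, j), i != j, whose distance d satisfies lo < d <= hi."""
    n = len(coords)
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and lo < math.dist(coords[i], coords[j]) <= hi]

def morans_i(values, pairs):
    """Moran's I with binary weights; assumes values are not all identical."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = sum(dev[i] * dev[j] for i, j in pairs)
    return (n / len(pairs)) * cross / sum(d * d for d in dev)

def correlogram(values, coords, breaks, n_perm=999, seed=1):
    """Return (lo, hi, I, permutation P) for each distance class."""
    rng = random.Random(seed)
    expected = -1.0 / (len(values) - 1)   # expectation under no autocorrelation
    results = []
    for lo, hi in zip(breaks[:-1], breaks[1:]):
        pairs = pairs_in_class(coords, lo, hi)
        if not pairs:
            results.append((lo, hi, None, None))
            continue
        observed = morans_i(values, pairs)
        shuffled = list(values)
        extreme = 0
        for _ in range(n_perm):
            rng.shuffle(shuffled)     # reassign values to locations at random
            if abs(morans_i(shuffled, pairs) - expected) >= abs(observed - expected):
                extreme += 1
        results.append((lo, hi, observed, (extreme + 1) / (n_perm + 1)))
    return results

# Toy data: presence clustered at one end of a hypothetical line of stops.
coords = [(k * 640.0, 0.0) for k in range(20)]
values = [1] * 10 + [0] * 10
result = correlogram(values, coords, breaks=[0, 700, 1400, 2100])
```

With presences clustered at one end of the line, Moran's I in the first distance class is strongly positive and declines as the distance class widens, which is the signature of positive spatial autocorrelation.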

We used the logistic regression model as a starting 
point for fitting autologistic models. Logistic regres- 
sion modeled the logit of probability of presence 
of woodcock (p;) as a linear function of predictor 
variables: 


t 


logit(p;) = Hf i. | = Bo + ByXsy +... + BL Xni- 


Where p; was the probability of presence for American 
woodcock at the ith stop, By was the model intercept, 
Bi, - - - , Bn were parameter estimates, and X7,,..., 
X,; were values of predictor variables at the ith stop. 
Autologistic regression was similar, with an additional
term (called the autocovariate) added to account for
spatial dependencies in the data by conditioning the
probability of presence ($p_i$) on neighboring stops:

$$\operatorname{logit}(p_i) = \beta_0 + \beta_1 X_{1i} + \dots + \beta_n X_{ni} + \beta_{n+1}\,\mathrm{autocov}_i,$$

where

$$\mathrm{autocov}_i = \frac{\sum_{j=1}^{k_i} w_{ij}\, y_j}{\sum_{j=1}^{k_i} w_{ij}}$$

was the value of the autocovariate for stop i and $\beta_{n+1}$
was the parameter estimate for the autocovariate. If
woodcock were present at stop j then $y_j = 1$, otherwise
$y_j = 0$. The autocovariate in our models represented
the weighted average of the number of singing-ground
survey stops with woodcock present among a set (i.e.,
clique) of $k_i$ neighbors of stop i. The weight

$$w_{ij} = \frac{1}{\delta_{ij}}$$

given to stop j was the inverse of the Euclidean distance
($\delta_{ij}$) between stops i and j (Augustin et al. 1996).
Autologistic models were fit using PROC LOGISTIC
(SAS Institute 1990).
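
The autocovariate described above, an inverse-distance-weighted average of presence among the neighbors of a stop, might be computed as in the following sketch. The function name, the treatment of stops with no neighbors (autocovariate set to zero), and the five-stop example are assumptions made for illustration, not details taken from the study.

```python
import math

def autocovariate(i, coords, presence, radius):
    """Inverse-distance-weighted mean presence among stops within `radius` of stop i."""
    weighted_presence = 0.0
    weight_total = 0.0
    for j, (xy, y) in enumerate(zip(coords, presence)):
        if j == i:
            continue
        dist = math.dist(coords[i], xy)
        if 0 < dist <= radius:
            w = 1.0 / dist  # weight is the inverse of the Euclidean distance
            weighted_presence += w * y
            weight_total += w
    # Assumption: a stop with no neighbors in its clique gets an autocovariate of 0.
    return weighted_presence / weight_total if weight_total > 0 else 0.0

# Hypothetical route: five stops 700 m apart, woodcock present at the first two.
coords = [(0.0, 700.0 * k) for k in range(5)]
presence = [1, 1, 0, 0, 0]

# Third-order clique: all neighbors within 2,100 m.
autocov = [autocovariate(i, coords, presence, 2100) for i in range(len(coords))]
print([round(a, 3) for a in autocov])
```

The resulting values would then enter the logistic model as one additional predictor column.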

Because we could not determine a priori what 
clique size was most appropriate, we constructed mul- 
tiple cliques to investigate the effect of size of neigh- 
borhood on the autologistic model. We used the 
Akaike Information Criterion (AIC) to identify the 
clique size that produced the most parsimonious autologistic
model. We defined the first-order clique as
all neighbors within 700 meters of stop i. Additional
cliques were constructed by adding neighbors at 700-meter
intervals (second-order clique = all stops within
1,400 meters, third-order clique = all stops within
2,100 meters, etc.) until the most parsimonious model 
was identified. Cliques were identified using the S-Plus 
spatial statistics module (Kaluzny et al. 1996). 

To investigate remaining spatial autocorrelation after 
model fitting, we constructed correlograms of Moran’s 
I for Pearson residuals from logistic and autologistic re- 
gression models. Autocorrelation analysis of residuals 
was conducted using the same methods used for response
and predictor variables. We assessed differences
in modeling efficiency by examining standard reclassification
statistics. We calculated the cutoff probability for
reclassification by determining the proportion of the 
total stops with woodcock present (cutoff probability = 
stops with woodcock/total stops). We defined sensitivity 
as the percentage of absent-stops (i.e., no woodcocks 
observed) correctly classified, specificity as the percent- 
age of present-stops (i.e., one or more woodcocks ob- 
served) correctly classified, and overall correct as the 
percentage of all stops correctly classified. Errors of 
commission measured the percentage of absent-stops
incorrectly classified (100 - sensitivity) and errors of
omission measured the percentage of present-stops
incorrectly classified (100 - specificity).
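
The reclassification statistics can be sketched as follows, using this chapter's definitions (sensitivity refers to absent-stops and specificity to present-stops; many texts assign the two terms the other way round). The function name, the rule that a predicted probability equal to the cutoff counts as present, and the ten-stop example are assumptions for illustration.

```python
def reclassification(observed, predicted_prob):
    """Reclassification statistics (percent) using the chapter's definitions."""
    # Cutoff probability = stops with woodcock / total stops.
    cutoff = sum(observed) / len(observed)
    # Assumption: a probability equal to the cutoff is classified as present.
    predicted = [1 if p >= cutoff else 0 for p in predicted_prob]

    absent = [(o, p) for o, p in zip(observed, predicted) if o == 0]
    present = [(o, p) for o, p in zip(observed, predicted) if o == 1]

    sensitivity = 100.0 * sum(1 for o, p in absent if p == 0) / len(absent)
    specificity = 100.0 * sum(1 for o, p in present if p == 1) / len(present)
    overall = 100.0 * sum(1 for o, p in zip(observed, predicted) if o == p) / len(observed)
    return {
        "cutoff": cutoff,
        "sensitivity": sensitivity,          # absent-stops correctly classified
        "specificity": specificity,          # present-stops correctly classified
        "overall_correct": overall,
        "commission": 100.0 - sensitivity,   # absent-stops incorrectly classified
        "omission": 100.0 - specificity,     # present-stops incorrectly classified
    }

# Hypothetical example: ten stops, woodcock observed at the first two.
observed = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
predicted_prob = [0.60, 0.10, 0.30, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05]
stats = reclassification(observed, predicted_prob)
for name, value in stats.items():
    print(f"{name}: {value:.1f}" if name != "cutoff" else f"{name}: {value:.2f}")
```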


Results 


Woodcock were observed at 103 of the 960 total 
singing-ground survey stops. Woodcock observations 
tended to be clustered. No woodcock were observed 
on fifty-four singing-ground survey routes, and all 
present-stops occurred on the remaining forty-two 
routes. On routes where woodcock were observed, the 
average number of present-stops was 2.45. 

Preliminary screening of univariate logistic regres- 
sion models indicated that twelve habitat variables 
were potentially important predictors of woodcock 
presence (P less than 0.25, Table 27.1). Through cor- 
relation analysis and multivariate model fitting, we 
converged on a final logistic regression model consist- 
ing of four main-effect terms and two 2-way interac- 
tions (Table 27.2). All terms in the final logistic regres- 
sion model were class-level percent cover variables 
and the model was significant at P less than 0.001. 

Moran’s I correlograms indicated significant spatial 
autocorrelation for response and predictor variables 
from the logistic regression model (Fig. 27.1). Wood- 
cock presence was most strongly autocorrelated at the 
smallest distance categories (Fig. 27.1A). The percent 
cover of water was significantly autocorrelated at the 
four smallest distance categories (Fig. 27.1E); how- 
ever, other habitat variables were significantly auto- 
correlated over a larger range of distances (Fig. 
27.1B-D). 

We fit autologistic regression models with cliques
of neighbors less than or equal to 700 meters
(first order), less than or equal to 1,400 meters (second
order), less than or equal to 2,100 meters (third
order), and less than or equal to 2,800 meters
(fourth order). The third-order autologistic regression
model provided the most parsimonious fit based on
AIC values. The relative magnitudes of parameter estimates
for habitat variables in the autologistic model
were consistently less than in the logistic regression
model (Table 27.2).

TABLE 27.1.

Means, standard errors (SE), and P-values for G tests from univariate logistic regression models for
habitat variables associated with American woodcock (Scolopax minor) singing-ground survey stops in
Pennsylvania, 1996-1997.

                                  Present-stops^a             Absent-stops^b
Habitat variable                  Mean          SE            Mean          SE            P
Landscape-level indices
  Number of patches               38.44         [illegible]   38.73         0.52          0.850
  Mean patch size (ha)            [illegible]   0.04          [illegible]   0.03          0.066
  Edge (km)                       [illegible]   0.20          8.48          0.85          0.918
  Shannon diversity               [illegible]   0.03          [illegible]   0.01          0.545
  Interspersion/juxtaposition     [illegible]   0.94          63.66         0.39          0.196
Class-level indices
  Percent cover
    Water                         [illegible]   1.08          0.79          0.12          <0.001
    Coniferous forest             6.82          [illegible]   [illegible]   0.36          0.501
    Mixed forest                  10.47         1.28          11.00         0.44          0.685
    Broadleaf forest              40.71         [illegible]   39.70         0.81          0.682
    Early successional            14.06         [illegible]   10.16         0.33          <0.001
    Perennial herbaceous          [illegible]   1.44          16.50         0.53          0.405
    Annual herbaceous             8.19          [illegible]   12.27         0.55          0.008
    Terrestrial unvegetated       [illegible]   0.26          1.92          0.14          0.071
  Number of patches
    Water                         0.72          0.15          0.63          0.06          0.632
    Broadleaf forest              6.26          0.34          [illegible]   0.16          [illegible]
    Early successional            10.50         0.54          9.67          [illegible]   0.191
    Terrestrial unvegetated       [illegible]   0.20          2.04          0.09          0.008
  Edge (km)
    Broadleaf forest              4.91          0.16          4.46          0.07          0.022
    Early successional            3.59          0.21          2.83          0.70          <0.001
  Mean patch size (ha)
    Water                         1.09          0.40          0.21          0.04          <0.001
    Broadleaf forest              [illegible]   0.69          [illegible]   0.30          0.506
    Early successional            0.89          0.07          0.47          0.05          0.447
    Terrestrial unvegetated       0.19          0.06          0.19          0.01          0.966

^a Number of woodcock observed > 0, n = 103.
^b Number of woodcock observed = 0, n = 857.

Moran’s I correlograms indicated significant autocorrelation
of Pearson residuals for small-distance categories
after fitting the logistic regression model (Fig.
27.2A). The third-order autologistic regression model
effectively accounted for the majority of spatial autocorrelation
and completely removed all significant
positive spatial autocorrelation (Fig. 27.2B).

TABLE 27.2.

Parameter estimates from logistic regression and autologistic regression models for predicting American woodcock (Scolopax
minor) presence from habitat variables associated with singing-ground survey stops in Pennsylvania, 1996-1997.

                                                                                            Water x         Ann. herb. x
Model                    Intercept  Water^a  Early succ.^a  Ann. herb.^a  Terr. unveg.^a  Early succ.^b   Terr. unveg.^b  Autocov.^c
Logistic regression      -2.134     0.043    0.033          -0.022        -0.113          0.006           -0.016          n/a
Autologistic regression  -3.073     0.003    0.029          -0.014        -0.083          0.005           -0.016          4.679

^a Main effect terms: Water = Water (%), Early succ. = Early successional (%), Ann. herb. = Annual herbaceous (%), Terr. unveg. = Terrestrial unvegetated (%).
^b Two-way interactions of mean-centered main effect terms.
^c Autocovariate term.

Figure 27.1. Moran's I correlograms of response and predictor
habitat variables used in the logistic regression and autologistic
regression models for predicting American woodcock
(Scolopax minor) presence from habitat variables associated
with singing-ground survey stops in Pennsylvania, 1996-1997.
Filled circles indicate I values significantly different from zero
(P < 0.05).

Figure 27.2. Moran's I correlograms of Pearson residuals
from (A) logistic regression and (B) autologistic regression models for
predicting American woodcock (Scolopax minor) presence from
habitat variables associated with singing-ground survey stops
in Pennsylvania, 1996-1997. Filled circles indicate I values
significantly different from zero (P < 0.05).

Reclassification statistics were better for the autologistic
model than for the logistic regression
model. Model sensitivity, specificity, and percent correct
were 63.1, 56.3, and 62.4 percent, respectively,
for the logistic regression model. Errors of commis- 
sion and omission were 36.9 and 43.7 percent, respec- 
tively, for the logistic regression model. Model sensi- 
tivity, specificity, and percent correct were 83.4, 66.0, 
and 81.6 percent, respectively, for the autologistic re- 
gression model. Errors of commission and omission 
were 16.6 and 34.0 percent, respectively, for the autol- 
ogistic regression model. 
Discussion


Models of wildlife-habitat relationships may be af- 
fected by spatial autocorrelation in two ways. First, 
neighboring locations may tend to exhibit similar en- 
vironmental conditions due to spatial proximity re- 
sulting in positive autocorrelation of predictor vari- 
ables. Second and independent of environmental 
conditions, the occurrence of the species of interest 
may not be independent of the occurrence of con- 
specifics at neighboring locations due to behavioral 
processes resulting in autocorrelation of the response 
variable (Augustin et al. 1996). Scale of measurement, 
species, and behavioral processes will influence 
whether response variables will exhibit positive auto- 
correlation (e.g., flocking, lekking) or negative auto- 
correlation (e.g., territoriality, competition). In both 
situations, model residuals are expected to exhibit sig- 
nificant spatial autocorrelation if spatial dependencies 
are not effectively modeled. Therefore, inference based 
on inappropriate statistical analyses may lead to inac- 
curate conclusions about species-habitat relationships. 

Our analyses indicated significant positive spatial 
autocorrelation of both response and predictor vari- 
ables. Due to the systematic nature of singing-ground 
surveys, these patterns of autocorrelation were ex- 
pected. If woodcock occurred on a singing-ground 
survey route, they were commonly detected at more 
than one stop. Stops on singing-ground survey routes 
were spaced 0.64 kilometers apart, and thus duplicate 
counting of individuals at adjacent stops should have 
been minimal (Tautin et al. 1983). Zero counts, which 
are common on singing-ground survey routes, and 
clustering of present-stops within routes produced sig- 
nificant positive spatial autocorrelation. The correlo- 
gram of presence/absence indicated significant but rel- 
atively weak autocorrelation within small distance 
categories. Other researchers have noted clumped dis- 
tributions of woodcock on singing-ground survey 
routes (Tautin 1982). Clustering of woodcock on 
singing-ground survey routes likely resulted from 
neighboring sites with similar habitat conditions. 
However, clustering of woodcock may have also re- 
sulted from behavioral processes related to courtship 
behavior (i.e., lekking, male-dominance polygyny)
(Hirons and Owen 1982; Oring 1982; Dwyer et al.
1988; Ellingwood et al. 1993).

Habitat covariates in our logistic regression model 
also exhibited positive spatial autocorrelation. Auto- 
correlation of water was initially strong but declined 
quickly. Water elements along singing-ground survey 
routes were limited primarily to streams; therefore, 
only short-range positive autocorrelation was ex- 
pected due to the spatially limited nature of streams. 
Autocorrelation of the percentage of annual herba- 
ceous, terrestrial unvegetated, and early successional 
cover persisted across longer distances. We used rela- 
tively coarse-grained habitat information (30x30- 
meter pixels), and this limited our ability to detect 
fine-scale changes along singing-ground survey routes. 
Fine-scale habitat characteristics may change signifi- 
cantly within a region although coarse-scale habitat 
variables remain relatively consistent. 

In some situations, ecological covariates alone may 
be sufficient for eliminating spatial dependencies in 
model residuals (Gumpertz et al. 1997). However, the 
Pearson residuals from our logistic regression model 
exhibited significant positive spatial autocorrelation at 
small distances indicating logistic regression was inef- 
fective for completely accounting for autocorrelation 
of response and predictor variables. Use of autologis- 
tic regression reduced spatial autocorrelation in model 
residuals. Examination of the correlogram for the au- 
tologistic model showed all positive spatial autocorre- 
lation had been adequately modeled. Small but signifi- 
cant negative spatial autocorrelation persisted for 
stops 700 meters or less apart. When positive spatial
autocorrelation is present at small distances, statistical
techniques that do not account for the autocorrelation
will produce inflated parameter estimates and will
declare results significant too often (Cliff and Ord
1981; Legendre and Fortin 1989).


Habitat Relationships 


Fine-scaled studies have demonstrated that American 
woodcock exhibit positive associations with early- 
successional mesic forests, wetlands, and water, and 
they exhibit negative associations with agricultural and 
urbanized lands (Kinsley et al. 1980; Gutzwiller et al. 
1983; Hudgins et al. 1985; Straw et al. 1986). Vari- 
ables in our logistic regression model were consistent 




with habitat variables known to be associated with 
woodcock singing-ground use. Variables positively as- 
sociated with woodcock presence in our logistic regres- 
sion model were the percent cover of water and early 
successional habitats and their interaction. Variables 
negatively associated with woodcock presence were the 
percent cover of annual herbaceous and terrestrial un- 
vegetated habitats and their interaction. No class-level 
or landscape-level habitat heterogeneity indices were 
selected as significant predictors in our logistic regres- 
sion model. Klute (1999) reported coarse-grained, per- 
cent-cover variables were most strongly associated 
with woodcock presence when measured at small spatial
extents (i.e., small buffers). Landscape heterogeneity
indices exhibited stronger associations when measured
at large spatial extents (i.e., large buffers). The
importance of percent-cover variables in our models
has been shown to be a direct result of the size of the
buffer in which habitat variables were measured (Klute
1999).

The logistic regression model appeared to give a 
reasonable picture of coarse-scale habitat relationships 
for American woodcock, based on known fine-scale 
habitat preferences. However, the logistic regression 
model was not appropriate for these data as indicated 
by the spatial autocorrelation in model residuals. The 
autologistic model effectively accounted for all posi- 
tive spatial autocorrelation (Fig. 27.2). Furthermore, 
autologistic regression is expected to be an appropri- 
ate model because spatial dependencies may have re- 
sulted from both the distribution of habitat variables 
and processes related to the breeding behavior of male 
woodcock. 

When true spatial dependencies are ignored, pa- 
rameter estimates in a logistic regression model are ex- 
pected to be inflated, because part of the effect that is 
due to the spatial dependence between neighboring lo- 
cations is attributed to other predictor variables (Au- 
gustin et al. 1996; Gumpertz et al. 1997). When inter- 
preting results from our autologistic model, it is 
important to recognize that parameter estimates repre- 
sent the effect of habitat variables given the response 
status of neighboring stops. This effect may be differ- 
ent from the unconditional effect of the habitat vari- 
ables (Gumpertz et al. 1997). For all habitat variables, 
introducing the autocovariate resulted in the expected 


decrease in the magnitude of the parameter estimates. 
The parameter estimate for percent cover of water ex- 
hibited the largest decrease. The importance of water 
for predicting American woodcock presence may be 
much less than expected for coarse-scaled studies, 
after the spatial dependence on neighboring stops is 
considered. All other main effects also showed de- 
creased importance when spatial autocorrelation was 
explicitly modeled. If we had ignored the spatial de- 
pendencies in our data and modeled using only logis- 
tic regression, we would have overemphasized the im- 
portance of all habitat variables for predicting 
woodcock presence. Such a mistake could lead to in- 
accurate predictive models, over- or underestimates of 
suitable woodcock habitat, and poor information for 
use in regional management decisions. 


Model Prediction 


Examination of reclassification statistics allowed us to 
determine the potential usefulness of logistic regression
versus autologistic regression as predictive models
of woodcock habitat availability. The overall percentage
of stops correctly reclassified was higher for
the autologistic model than for the logistic regression
model. If our models were to be used for predicting the
distribution of woodcock habitat, we expect that the 
autologistic model would provide more-accurate pre- 
diction because it explicitly accounts for spatial de- 
pendencies and more accurately models habitat rela- 
tionships. Model sensitivity (percentage of correctly 
classified absent stops) increased substantially between 
the logistic regression and the autologistic regression 
models. Model specificity (percentage of correctly 
classified present stops) exhibited smaller increases be- 
tween the logistic regression and the autologistic re- 
gression models. Several factors may have contributed 
to incorrect classification of both present and absent 
stops. First, all relevant habitat variables may not have 
been included. Second, because each survey route was 
visited only one or two times, woodcock may not have 
been detected at some stops when they were actually 
present. This result would have directly contributed to 
incorrect classification of absent-stops. Undetected
woodcock would have also contributed to miscalcula- 
tion of the autocovariate term in the autologistic mod- 
els because it would have affected the characteristics 
of the cliques of some stops. Third, woodcock may
not have been detected at some stops with suitable 
habitat because not all suitable sites are inhabited by 
woodcock in a given year. Perfect detection and com- 
plete use of suitable habitat would be needed to re- 
duce reclassification errors. Regardless of the sources 
of reclassification errors, an explicit consideration of 
spatial dependencies improved the predictive capabili- 
ties of our models. 


Conclusions 


Our analyses demonstrated the importance of explic- 
itly considering spatial autocorrelation in the develop- 
ment of models of species-habitat relationships. Sys- 
tematic sampling often provides a simple, effective, 
and inexpensive method for determining presence and 
abundance of wildlife species; however, systematic
sampling will frequently produce
spatial autocorrelation of response and predictor
variables. Moreover, spatial autocorrelation is an in- 
herent and necessary component of functional ecolog- 
ical systems, and understanding spatial structure is 
necessary for a complete understanding of biological 
populations and habitat selection. Autologistic re- 
gression is an appropriate technique for use with a 
categorical response variable and continuous and cate- 


gorical predictor variables that exhibit spatial depend- 
ence. Furthermore, autologistic regression provides es- 
timates of both the strength of species-habitat rela- 
tionships and the strength of dependence among 
spatially neighboring areas, resulting in a more com- 
plete description of the factors influencing the distri- 
bution of organisms in the environment. This infor- 
mation may be used by land managers to more 
accurately identify areas of high habitat suitability and 
to better understand the importance of habitat ele- 
ments for the species of interest. 
A Neural Network Model for Predicting
Northern Bobwhite Abundance in the 
Rolling Red Plains of Oklahoma 


Jeffrey J. Lusk, Fred S. Guthery, and Stephen J. DeMaso 


ive... predictions of species abundance 
are necessary for management and conserva- 
tion to be effectively implemented (Leopold 1933; Pe- 
ters 1992; Schneider et al. 1992). Such predictions are 
increasingly important as human impacts on the envi- 
ronment increase. Artificial neural network (ANN) 
models are extremely powerful and allow the investi- 
gation of linear and nonlinear responses. As such, 
ANN models offer ecologists a powerful new tool for 
understanding the ecologies of declining species, 
which can lead to more-effective management (Colas- 
anti 1991; Edwards and Morse 1995; Lek et al. 
1996b,c; Lek and Guégan 1999). 

Current applications of ANN models include statis- 
tical modeling (Smith 1996). In this capacity, ANN 
models have considerable advantages over traditional 
statistical models, such as regression. Artificial neural 
networks are extremely powerful due to their capacity 
to learn from the data used during training. Another 
advantage of ANN models over traditional models is 
that ANNs are inherently nonlinear (Haykin 1999:2). 
Because most ecological phenomena are nonlinear 
(Maurer 1999:110), this property of ANN models 
makes them more useful than standard statistical 
models that are often limited to linear relationships 
(Lek et al. 1996b). Even minor nonlinearities in the re- 
sponse of one variable to another can reduce the pre- 
dictive power of traditional statistical techniques 


(Paruelo and Tomasel 1997). Neural networks also do
not require any a priori knowledge of the nature of
the relationship between predictor and response variables,
a requirement that makes available nonlinear methods
cumbersome (Smith 1996:19-20). ANNs find the form of
the response in the data presented to them and, as 
such, are not constrained to simple curves, as are 
curvilinear regression techniques (Pedhazur 1982:406; 
Smith 1996:20). Finally, ANN models are nonparametric
(Smith 1996:20). Use of non-normal data for
neural model development will not bias the results 
(Baran et al. 1996). 

We developed an artificial neural network model to 
investigate the influence of weather patterns on the 
abundance of northern bobwhites (Colinus virgini- 
anus; bobwhites hereafter) in a semiarid region of 
western Oklahoma, United States. An understanding 
of the effects of weather on species abundances is war- 
ranted in the light of global climate change (Root 
1993; Schneider 1993). We also sought to evaluate the 
ANN modeling technique. Specifically, we (1) com- 
pared ANN model output with that of a traditional 
multiple regression model, (2) determined which 
model was better by using a sums of squares criterion 
(Hilborn and Mangel 1997), and (3) conducted simu- 
lation modeling using the ANN and regression 
models. 

Much is known about bobwhite ecology, so the species
offers an effective means of evaluating the ANN
technique and its applicability to management and
conservation. Furthermore, an understanding of bob- 
white-climate relationships is an important compo- 
nent of management and conservation of bobwhites. 
Bobwhite abundance has declined over much of their 
range during the past several decades (Koerth and 
Guthery 1988; Brennan 1991; Church et al. 1993; 
Sauer et al. 1997). Bobwhite declines may be acceler- 
ated by climate change in some regions of their range 
(Guthery et al. 2000). Although we cannot manage 
the weather, we can factor in its effects when making 
management plans. By working in cooperation with 
state management agencies, the results of our research 
can be directly and immediately applied in the field, 
completing the research-management cycle (Hejl and 
Granillo 1998; Kochert and Collopy 1998; Young and 
Varland 1998).


Methods 


We modeled bobwhite abundance in the Rolling Red 
Plains ecoregion of Oklahoma. This ecoregion is in 
western Oklahoma, excluding the panhandle (Peoples 
1991), and occupies 5.7 million hectares. Mean an- 
nual precipitation is 58 centimeters (Oklahoma Cli- 
matological Survey unpublished data). 

Biologists from the Oklahoma Department of 
Wildlife Conservation counted bobwhites in each 
county in Oklahoma. Survey routes were established 
in typical quail habitat (Peoples 1991). Each 32- 
kilometer route was surveyed twice annually begin- 
ning in 1991: once in August and once in October. 
Surveys were conducted at either sunrise or one hour 
before sunset. Total number of bobwhites observed 
per 32-kilometer route was used as an index of bob- 
white abundance. Although roadside counts such as 
these are prone to biases, these surveys are positively 
related to the fall harvest in Oklahoma (r > 0.70, S.
DeMaso unpublished data). 


Artificial Neural Networks 


Artificial neural networks are mathematical algo- 
rithms developed to imitate the function of brain cells 
for the study of human cognition (Hagan et al. 1996; 
Smith 1996:1; Haykin 1999:6-9). However, early 
techniques were handicapped by their inability to han- 


dle nonlinear relationships (Hagan et al. 1996:1-4; 
Smith 1996:8). In the 1980s, neural network modeling 
experienced a renaissance of sorts with the develop- 
ment of a backpropagation algorithm (see below) that 
is capable of handling nonlinear relationships (Smith 
1996:20). 

Because of their foundations in cognitive science, 
many of the terms used to describe aspects of ANNs 
are derived from neurobiology. What follows is a 
short explanation of the terminology of neural net- 
work modeling and a brief description of how a typi- 
cal neural model works. A neural network typically 
consists of three layers: the input nodes, the neurons 
(also called hidden nodes or processing elements), and 
the output nodes. However, ANNs with more than 
one neuron layer are possible. Typically, each node in
each layer is connected to each node in the previous
layer by synapses (connection weights); such a network
is termed fully connected (Smith 1996:21). The
synapses store the information learned by the model 
(Haykin 1999) and are analogous to regression coeffi- 
cients (Heffelfinger et al. 1999). Each input node rep- 
resents an independent variable. Values of input nodes 
are scaled so that they range between zero and one 
(Smith 1996:67). Each neuron processes the input
nodes by computing a logistic function of the weighted
sum of the inputs:

$$g(u) = \frac{1}{1 + e^{-u}},$$

where u is the weighted sum of the inputs ($w_j x_j$) plus a
bias weight ($w_0$):

$$u = w_0 + \sum_{j=1}^{J} w_j x_j$$

(Smith 1996:40). The logistic function above is the
most widely used but is not the only function available 
(Smith 1996:35). The values calculated by the neu- 
rons, g(u), are transferred to the output nodes. The 
output nodes perform a similar calculation and their
output is detransformed to obtain a prediction of the
dependent variable (Smith 1996:22). In backpropagation
ANNs, the error between the predicted output
and the actual output is calculated and propagated 
back through the model where it is used to adjust the 
values of the synaptic weights according to one of a 
variety of learning rules (Hagan et al. 1996:11-40;
Smith 1996:67). The adjustment of the synapses is 
termed learning (Smith 1996:59). This process contin- 
ues iteratively, with synapses adjusted after each for- 
ward pass, and is termed training. With each iteration, 
the ANN learns more about the relationship between 
inputs and outputs and, therefore, the prediction error 
decreases. Training is stopped before the model maps
the relationship between inputs and outputs exactly.
If training continues until the mapping is exact, the
network is said to be overtrained and the model’s
predictive abilities are diminished when presented
with novel data (Hagan et al. 1996:11-22; Smith 1996:113). The use of ANNs in
the ecological sciences requires predictability, and 
there is a trade-off between model generality and ac- 
curacy of prediction. 
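
The forward pass, weight adjustment, and stopping of training described above can be sketched in a few dozen lines. This is an illustrative toy, not the QNet software used in the study: the class name `TinyNet`, the squared-error criterion, the per-example weight updates, the patience-based stopping rule, and the simulated data are all assumptions. Inputs are taken to be already scaled to the range zero to one.

```python
import math
import random

def logistic(u):
    """Logistic transfer function g(u) = 1 / (1 + e^-u)."""
    return 1.0 / (1.0 + math.exp(-u))

class TinyNet:
    """Fully connected network: inputs -> one layer of neurons -> one output node."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight row holds a bias weight w_0 followed by one weight per input.
        self.w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        self.v_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]  # momentum terms
        self.v_o = [0.0] * (n_hidden + 1)

    def forward(self, x):
        hidden = [logistic(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) for w in self.w_h]
        out = logistic(self.w_o[0] + sum(wj * hj for wj, hj in zip(self.w_o[1:], hidden)))
        return hidden, out

    def train_step(self, x, target, rate=0.2, momentum=0.8):
        hidden, out = self.forward(x)
        # Error signals for squared error with logistic transfer functions,
        # computed before any weight is changed.
        delta_o = (out - target) * out * (1.0 - out)
        delta_h = [delta_o * self.w_o[k + 1] * h * (1.0 - h) for k, h in enumerate(hidden)]
        for k, g in enumerate([1.0] + hidden):
            self.v_o[k] = momentum * self.v_o[k] - rate * delta_o * g
            self.w_o[k] += self.v_o[k]
        for k, d in enumerate(delta_h):
            for j, g in enumerate([1.0] + list(x)):
                self.v_h[k][j] = momentum * self.v_h[k][j] - rate * d * g
                self.w_h[k][j] += self.v_h[k][j]

def mse(net, data):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in data) / len(data)

def train(net, train_data, valid_data, max_iter=500, patience=50):
    """Train until validation error stops improving; return the best validation error."""
    best, since_best = mse(net, valid_data), 0
    for _ in range(max_iter):
        for x, t in train_data:
            net.train_step(x, t)
        err = mse(net, valid_data)
        if err < best:
            best, since_best = err, 0
        else:
            since_best += 1
            if since_best >= patience:
                break  # stop before the network is overtrained
    return best

# Simulated data: two inputs in [0, 1] and a nonlinear response in [0.1, 0.9].
rng = random.Random(1)
def make_rows(n):
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        rows.append((x, 0.1 + 0.8 * x[0] * x[1]))
    return rows

train_data, valid_data = make_rows(40), make_rows(20)
net = TinyNet(n_in=2, n_hidden=3)
initial_error = mse(net, valid_data)
best_error = train(net, train_data, valid_data)
print(f"validation MSE before training: {initial_error:.4f}, best during training: {best_error:.4f}")
```

A fuller version would also save the weights at the iteration with the lowest validation error and restore them after stopping; the sketch only records that error.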

Because ANN models begin training with randomly 
selected connection weights, the minimum error 
achieved by a network may not be the global mini- 
mum, but only a local minimum (Smith 1996:62). 
Therefore, an error minimum lower than the one 
achieved by the network may exist. However, Smith 
(1996:62) reported that the probability of such local 
minima existing decreases as more neurons are added 
to the model. Determining the optimum number of 
neurons should, therefore, maximize the chances of 
finding the global minimum in the error surface. 


Database Construction 


Roadside counts were initiated in Oklahoma in 1991, 
and, therefore, our database comprised the 1991—1996 
bobwhite surveys. We averaged each year's August and 
October count for our models. The database also in- 
cluded weather and land-use data as independent vari- 
ables. Weather data were obtained on CD-ROM from 
EarthInfo (Boulder, Colo.). We extracted mean
monthly temperature data for June, July, and August. 
Seasonal precipitation data were calculated from total 
monthly precipitation. We divided the year as follows: 
winter—December, January, and February; spring— 
March, April, and May; and summer—June, July, and 
August. Therefore, seasonal precipitation equaled total 
monthly precipitation averaged for each three-month 
period. We grouped climate data into these periods be- 
cause they represent ecologically important phases of 
the bobwhite’s life cycle (breeding, recruitment, and 


winter survival). We did not include any time lag for 
the effects of rainfall on quail abundance because other 
networks we developed indicated this lag effect was 
not important to model predictions (J. Lusk unpub- 
lished data). We used weather stations closest to each 
survey route for obtaining weather data. As measures 
of land use and human impacts, we used cattle density 
on nonagricultural lands (total head per square kilome- 
ter) and the proportion of county area in agricultural 
crop and hay production (hereafter, agricultural pro- 
duction). We selected these variables because they are 
likely to have the greatest effect on bobwhite abun- 
dance (Murray 1958; Roseberry and Sudkamp 1998). 
Bobwhite abundance in Florida varied directly with 
cultivated acreage and inversely with acreage grazed 
(Murray 1958). These land-use variables were deter- 
mined at the county level and were extracted from the 
Oklahoma Department of Agriculture’s annual crop 
statistics for each survey year in the database. 
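
The seasonal precipitation variables described above might be derived from monthly totals as in the following sketch. The data structure, the function name, and in particular the assignment of December to the winter that ends in the survey year are assumptions; the text does not state how December was assigned.

```python
# Months in each season, as defined in the text.
SEASONS = {
    "winter": (12, 1, 2),
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
}

def seasonal_precipitation(monthly, year):
    """Average total monthly precipitation over each three-month season.

    `monthly` maps (year, month) to total precipitation for that month.
    """
    result = {}
    for season, months in SEASONS.items():
        totals = []
        for month in months:
            # Assumption: December of the previous year belongs to this year's winter.
            source_year = year - 1 if month == 12 else year
            totals.append(monthly[(source_year, month)])
        result[season] = sum(totals) / len(totals)
    return result

# Hypothetical monthly totals (cm) for one weather station.
monthly = {
    (1990, 12): 3.0, (1991, 1): 6.0, (1991, 2): 9.0,
    (1991, 3): 5.0, (1991, 4): 7.0, (1991, 5): 12.0,
    (1991, 6): 10.0, (1991, 7): 4.0, (1991, 8): 7.0,
}
seasonal = seasonal_precipitation(monthly, 1991)
print(seasonal)
```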

The final variable included in the data set was the 
number of bobwhites counted during the previous 
year’s survey. The number of bobwhites present in one 
year is dependent on the number of bobwhites present 
the previous year. Furthermore, survival and repro- 
duction may be density dependent (Roseberry and 
Klimstra 1984). 


ANN Construction, Training, and Validation 


Network architecture. We used a three-layered, back- 
propagation neural network. The network consisted 
of a layer of input nodes representing the independent 
variables, a layer of neurons, and an output node rep- 
resenting the dependent variable. Our model was fully 
connected (Smith 1996:21). We used a commercial 
neural-modeling software package (QNet for Win- 
dows, ver. 97.02, Vesta Services, Winnetka, Ill.) for 
ANN development. Including too many neurons in 
the neuron layer may result in reduced prediction abil- 
ity and including too few will limit the complexity the 
network can accurately learn (Smith 1996:120-123). 
Therefore, we determined the optimal number of neu- 
rons experimentally by training models in which the 
same data set and model parameters were used but the 
number of neurons was varied. We developed models 
that contained two to nine neurons. We limited the 
maximum number of neurons to the number of input
variables in the model. We selected the model with the
best performance, gauged as the correlation between
the predicted counts obtained from the model and the 
actual counts in the validation data set. 
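A fully connected, three-layered network of the kind described above can be sketched as a single forward pass. The logistic activation function and all weight values here are assumptions made for illustration; the text does not state which transfer function QNet applied, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic transfer function (an assumed choice, not stated in the text)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass through an input layer, one neuron layer, and one output node.

    hidden_weights[j][i] is the synaptic weight from input i to neuron j.
    """
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Nine inputs and three neurons, with all weights set to zero for illustration.
n_inputs, n_neurons = 9, 3
zero_hidden = [[0.0] * n_inputs for _ in range(n_neurons)]
prediction = forward([1.0] * n_inputs, zero_hidden, [0.0] * n_neurons,
                     [0.0] * n_neurons, 0.0)
print(prediction)
```

Under this layout, comparing candidate architectures amounts to repeating training for each neuron count from two to nine and keeping the network whose validation predictions correlate best with the actual counts.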

Training parameters. We used an adaptive learning 
rule during model training (Smith 1996). In addition, 
three parameters were adjusted to optimize model per- 
formance. These parameters were the number of itera- 
tions, the learning rate, and the momentum. The val- 
ues we selected for the learning rate and momentum 
were within the range of those found to be most effec- 
tive in a wide variety of neural network applications 
(Smith 1996:77-90). The number of iterations con- 
trols how long the model has to learn the pattern and 
relationships among the variables in the model. The 
larger the number of iterations, the more attempts the 
network has to minimize prediction errors. We trained 
our model for ten thousand iterations. We believed 
that ten thousand iterations would allow the network 
to find the error minimum and allow us to stop train- 
ing if the network began to overfit the data. The learn- 
ing rate controls the magnitude of the corrections of 
the synaptic weights per iteration based on the direc- 
tion and magnitude of the change in the prediction 
error during past iterations (Smith 1996:77). Selecting
too small a learning rate will increase the number
of iterations necessary to reach an error minimum.
However, selecting too large a learning rate may
make the network unstable, resulting in oscillations in
the prediction error (Hagan et al. 1996:5-9). We used 
a learning rate of 0.05. The final network parameter 
was momentum. Momentum determines how many 
past iterations are used in determining synaptic-weight 
adjustments in the current iteration (Smith 1996: 
85-88). Momentum keeps the error corrections mov- 
ing in the same direction along the error surface 
(Smith 1996:85). If a large momentum value is used, it 
will take longer for weight corrections to respond to 
changes in the prediction error. In other words, synap- 
tic weight adjustments are based on the long-term 
trend in prediction error, and momentum determines 
the number of iterations used in determining the long- 
term trend. We used a momentum of 0.90. This mo- 
mentum is appropriate for most types of models 
(Smith 1996:86). 
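The roles of the learning rate and momentum can be illustrated with a classical momentum update on a one-dimensional error surface. This is a sketch under stated assumptions: the quadratic error surface and the function name are hypothetical, and the adaptive learning rule used by QNet is not reproduced here. Only the parameter values (0.05, 0.90, and ten thousand iterations) come from the text.

```python
def train_with_momentum(gradient, w0, learning_rate=0.05, momentum=0.90,
                        iterations=10_000):
    """Gradient descent with classical momentum on a single weight.

    The velocity term carries a decaying memory of past corrections, so a
    larger momentum makes the weight respond more slowly to recent changes
    in the error gradient.
    """
    w, velocity = w0, 0.0
    for _ in range(iterations):
        velocity = momentum * velocity - learning_rate * gradient(w)
        w += velocity
    return w


# Hypothetical error surface E(w) = (w - 3)^2 with gradient 2(w - 3).
w_final = train_with_momentum(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_final)
```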

Validation. To assess the predictive ability, accuracy,
and reliability of our ANN model, we presented
the trained model with data not used in network train- 
ing. We created a validation data set by extracting 20 
percent of the data from the original data set. Data 
were rank ordered by the number of quail counted, 
and every fifth record was assigned to the validation 
data set. There were ninety-eight records in the origi- 
nal database, resulting in twenty records in the valida- 
tion data set. The systematic removal of the validation 
data allowed us to gauge the performance of the net- 
work over the entire range of the original bobwhite 
count data. Because the validation data were derived 
from the original data set and were, therefore, ob- 
tained under the same conditions as those used for 
network training, the network can be considered only 
validated for this particular ecoregion in Oklahoma 
(Conroy 1993; Conroy et al. 1995). 
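The systematic extraction of validation records can be sketched as below. The function name and placeholder counts are hypothetical. Starting the selection at the first ranked record is an assumption, chosen because it reproduces the split of twenty validation records from ninety-eight.

```python
def systematic_split(records, key, step=5):
    """Rank records by `key` and assign every `step`-th one to validation."""
    ranked = sorted(records, key=key)
    validation = ranked[::step]  # ranks 0, 5, 10, ... (assumed starting point)
    training = [r for i, r in enumerate(ranked) if i % step != 0]
    return training, validation


# Ninety-eight placeholder counts standing in for the survey records.
counts = list(range(98))
training, validation = systematic_split(counts, key=lambda c: c)
print(len(training), len(validation))
```

Because the records are ranked before selection, the validation set spans the full range of counts rather than clustering at common values.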

In addition to our validation data set, we tested our 
model with data collected in the same ecoregion but 
not part of the training or validation data sets. Be- 
cause this model will eventually be used by managers 
to predict bobwhite abundance, this test will deter- 
mine the utility of the model. We presented the trained 
model with the 1997 data and recorded the accuracy 
of the predictions. 


Regression Analysis 


We performed a multiple regression analysis to com- 
pare ANN performance with that of this traditional 
statistical model. We used the same data set used for 
training and validating the ANN model for the regres- 
sion analysis. The full-model multiple linear regression 
included all the independent variables and the depend- 
ent variable used in the ANN model. We used the sta- 
tistical software package Statistix (Analytical Software 
1996). We used the Student's t-test for determining 
which variables were contributing (P ≤ 0.05) to the
model predictions (Analytical Software 1996). The 
correlation between each model's predicted and actual 
bobwhite count was used as an indicator of the rela- 
tive performance of each model. 
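The performance indicator used for both models, the correlation between predicted and actual counts, can be computed as a Pearson coefficient. The sketch below assumes that this is the correlation intended; the function name and example values are hypothetical.

```python
import math


def pearson_r(predicted, actual):
    """Pearson correlation between predicted and actual counts."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)


# Hypothetical predicted and actual counts.
r = pearson_r([10.0, 14.0, 21.0, 30.0], [12.0, 13.0, 25.0, 28.0])
print(r, r ** 2)
```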


Model Comparison 


We used the percent contribution of each variable to 
the ANN model's predictions to identify important 
variables (Özesmi and Özesmi 1999). The percentage
contribution is calculated by dividing the sum-of-
squared synaptic weights for the variable of interest by
the total sum-of-squared synaptic weights for all variables.
We also determined each variable’s contribution
to the total, unadjusted R² using a forward stepwise regression
(Wilkinson 1998). We calculated the increase
in R² after each variable was entered into the model to
apportion the amount of variance accounted for to
each variable. We then divided each individual R² by
the total unadjusted R² for the model. This gave the
percentage contribution of each variable in the regres- 
sion model to the model's response. This percentage is, 
therefore, homologous to the percent contribution of 
the ANN model. Although these percentage contribu- 
tions are not directly comparable, they allowed us to 
determine what variables were driving each model. 
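Both percentage contributions can be sketched in a few lines. Restricting the ANN calculation to the input-to-neuron weights is an assumption, since the text does not say which synapses enter the sums, and all names and numbers below are hypothetical.

```python
def percent_contribution(input_to_hidden):
    """Share of total squared synaptic weight attached to each input.

    input_to_hidden[j][i] is the weight from input i to neuron j.
    """
    n_inputs = len(input_to_hidden[0])
    per_input = [sum(row[i] ** 2 for row in input_to_hidden)
                 for i in range(n_inputs)]
    total = sum(per_input)
    return [100.0 * s / total for s in per_input]


def r2_shares(cumulative_r2):
    """Share of the final R^2 gained as each variable enters a stepwise model."""
    previous = [0.0] + cumulative_r2[:-1]
    increments = [after - before for before, after in zip(previous, cumulative_r2)]
    return [100.0 * inc / cumulative_r2[-1] for inc in increments]


# Hypothetical weights for two inputs and two neurons.
ann_shares = percent_contribution([[1.0, 2.0], [1.0, 0.0]])
# Hypothetical cumulative R^2 after each of three variables is entered.
reg_shares = r2_shares([0.09, 0.12, 0.15])
print(ann_shares, reg_shares)
```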

To determine if the differences in performance were 
due to the increased power of the ANN modeling tech- 
nique, or to the increased parameterization of the ANN 
model, we used a sum-of-squares criterion for model 
comparison (Hilborn and Mangel 1997:114-117). This
technique adjusts the sum of squared deviations (SS) by 
penalizing parameterization: 


$$\mathrm{SS}_a = \frac{\mathrm{SS}_m}{n - 2m}$$

where SS_a is the adjusted sum of squares, SS_m is the sum of squared deviations for the
model of interest, n is the sample size used to develop 
the model, and m is the number of parameters in the 
model (Hilborn and Mangel 1997:115). This sum-of- 
squares criterion is similar to Mallows' Cp (Hilborn and
Mangel 1997:116). As such, the model with the lowest 
adjusted sum-of-squares is selected as the best predictor 
of the dependent variable (Hilborn and Mangel 
1997:116). The SS deviations for each model were cal- 
culated from the observed and predicted values of the 
bobwhite counts. We calculated the SS from the training
data only, resulting in an n of 78. The ANN model
had thirty-four parameters (one for each synapse: nine 
inputs times three neurons equals twenty-seven, an ad- 
ditional three synapses connecting each of the neurons 
to the output node, and four bias weights, one for each 
neuron and output node), and the regression model had 
ten parameters (regression coefficients, one for each in- 
dependent variable, and the constant). 
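The parameter count and the adjusted sum-of-squares can be checked with a short calculation. The function names are hypothetical; the sums of squares, sample size, and parameter counts are those reported in the text and in Table 28.1.

```python
def ann_parameter_count(n_inputs, n_neurons, n_outputs=1):
    """Synapses plus bias weights in a fully connected three-layer network."""
    input_synapses = n_inputs * n_neurons
    output_synapses = n_neurons * n_outputs
    biases = n_neurons + n_outputs
    return input_synapses + output_synapses + biases


def adjusted_ss(ss, n, m):
    """Sum of squared deviations penalized for m parameters: SS / (n - 2m)."""
    if n <= 2 * m:
        raise ValueError("n must exceed 2m for the criterion to be defined")
    return ss / (n - 2 * m)


m_ann = ann_parameter_count(9, 3)
ss_ann = adjusted_ss(2821.64, n=78, m=m_ann)
ss_reg = adjusted_ss(12950.90, n=78, m=10)
print(m_ann, round(ss_ann, 1), round(ss_reg, 1))
```

With n = 78 and 34 parameters the ANN denominator is only 10, so the penalty dominates even though the ANN's unadjusted sum of squares is far smaller.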


Simulation Analyses 


Following model training and validation, we used sim- 
ulations to explore the effects of each independent 
variable on ANN model predictions (Lek et al. 1996a;
Heffelfinger et al. 1999). This allowed us to further 
evaluate model performance. We constructed simula- 
tion data sets in which one independent variable was 
allowed to vary incrementally between its maximum 
and minimum value and all other variables were held 
constant at their mean value. These data sets were 
then processed through the trained neural network to 
generate predicted bobwhite counts. Predicted counts 
were then plotted against the range of the variable al- 
lowed to vary to determine the response of network 
predictions to that particular variable. 
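The construction of a simulation data set can be sketched as follows. The function name, the number of increments, and the example records are hypothetical; the text does not report the step size used.

```python
def simulation_grid(data, vary, steps=11):
    """Vary one variable from its minimum to its maximum; hold others at their means.

    `data` is a list of dicts mapping variable names to values.
    """
    names = list(data[0])
    means = {name: sum(row[name] for row in data) / len(data) for name in names}
    low = min(row[vary] for row in data)
    high = max(row[vary] for row in data)
    grid = []
    for i in range(steps):
        row = dict(means)  # all other variables fixed at their mean
        row[vary] = low + (high - low) * i / (steps - 1)
        grid.append(row)
    return grid


# Two hypothetical records with two variables.
records = [{"june_temp": 26.0, "winter_precip": 2.0},
           {"june_temp": 32.0, "winter_precip": 4.0}]
grid = simulation_grid(records, "june_temp", steps=3)
print(grid)
```

Each row of the grid would then be passed through the trained network, and the predicted counts plotted against the varied variable.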


Results 


We determined that three neurons were optimal for 
the data set. The ANN model accounted for 78 percent
(R²) of the variation in bobwhite counts in the
training data and 32 percent of the variation in bobwhite
counts in the validation data (Fig. 28.1). The
lower R² for the validation data resulted mainly from
a single outlier (Fig. 28.1). With this outlier removed,
the amount of variation accounted for by the ANN
model increased to 52 percent. However, we could
find no reason for the large prediction error associated
with this data point and so provide both results here.
In our test with the 1997 data, the network model accounted
for 17 percent of the variation (R² = 0.17). The
full-model regression was not significant and ac- 
counted for 6 percent of the variation in bobwhite 


TABLE 28.1.

Parsimony analysis of the artificial neural network model and
the regression model using the adjusted sum-of-squares
(Hilborn and Mangel 1997).

Model                       Number of     Sum-of-squares   Adjusted
                            parameters                     sum-of-squares
Artificial neural network   34            2,821.64         282.1
Regression                  10            12,950.90        223.3


[Figure 28.1. Axes: Predicted Bobwhite Count (y) versus Actual Bobwhite Count (x); plot annotation: R² = 0.3223.]


Figure 28.1. Predicted Northern bobwhite (Colinus virginianus) 
counts from the artificial neural network model plotted against 
the actual values in (a) the training data set and (b) the valida- 
tion data set for the Rolling Red Plains of western Oklahoma. 
The trend line represents the linear model regression of pre- 
dicted bobwhite count on the actual bobwhite count. 


counts (F9,68 = 1.50; P = 0.17; Fig. 28.2). The regression
model accounted for 37 percent of the variation
in the validation data set (R² = 0.37; Fig. 28.2). The
sum-of-squares criterion indicated that the regression
model (SS_a = 223.3) was a better predictor of bobwhite
abundance than the ANN model (SS_a = 282.1;
Table 28.1). In other words, the increased predictive
power of the ANN model was not enough to warrant 
increased complexity. 
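As a consistency check on the regression statistics reported above, the standard formulas for the overall F statistic and the adjusted R² can be applied to the total R² of 0.166 given in the footnote to Table 28.2, with nine predictors and the 78 training records. This is a reader's sketch rather than the authors' computation, and reading the reported 6 percent as an adjusted R² is an assumption.

```python
def overall_f(r2, n, k):
    """Overall F statistic for a linear regression with k predictors."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def adjusted_r2(r2, n, k):
    """R^2 adjusted for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


f_stat = overall_f(0.166, n=78, k=9)
adj = adjusted_r2(0.166, n=78, k=9)
print(round(f_stat, 2), round(adj, 3))
```

Under these assumptions the F statistic has 9 and 68 degrees of freedom and comes out near the reported 1.50, while the adjusted R² is close to 6 percent.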

Although it is not possible to determine statistically 
the significance of the variables in the ANN model, we 
assume that the importance of each independent variable
is related to the magnitude of its contribution to 
predictions. Each of the independent variables con- 
tributed some information to the model predictions 


Figure 28.2. Predicted northern bobwhite (Colinus virginianus)
counts from the full model regression plotted
against the actual values in (a) the training data set and (b) the 
validation data set for the Rolling Red Plains of western Okla- 
homa. The trend line represents the linear model regression of 
predicted bobwhite count on actual bobwhite count. 


(Table 28.2). Mean August temperature and summer 
precipitation had the highest individual contributions 
to the network outputs, with a combined contribution 
of 32 percent (Table 28.2). The remaining variables 
also contributed to the ANN model's predictions, but 
to a lesser extent (Table 28.2). There was one variable 
significant to the regression model: winter precipita- 
tion (Table 28.2). Winter precipitation also accounted 
for 54 percent of the total R² of the regression model
(Table 28.2). Only spring precipitation and the previous
year's bobwhite counts contributed more than 10
percent to the overall R² (15 and 11 percent, respectively;
Table 28.2). The density of cattle on nonagricultural
land contributed nothing to the overall R².
The Student's t-test we used to determine significant 
variables in the regression model was limited to linear

Figure 28.3. Neural network simulation analyses (solid line)
and regression predictions (dashed line) of the response of 
northern bobwhite (Colinus virginianus) counts in the Rolling 
Red Plains of western Oklahoma to mean monthly temperature 
in (a) June, (b) July, and (c) August. Temperature is reported in 
degrees Celsius, and the same scale was used for each plot. 


relationships. Such linear relationships did not exist 
for all variables as indicated by the ANN model. Pre- 
dicted bobwhite counts increased nonlinearly with in- 
creasing June and August mean monthly temperature. 
Predicted bobwhite counts increased with increasing 
June temperature until approximately 30 degrees Cel- 
sius, after which predicted counts decreased (Fig. 
28.3a). Predicted counts also increased with increasing 
August temperature until 34 degrees Celsius, after 
which predicted counts also decreased (Fig. 28.3c). 
The regression model predicted a steadily decreasing 
count with increasing June temperatures, and a 
steadily increasing bobwhite count with increasing 
August temperatures (Figs. 28.3a and 28.3c, respec- 
tively). As July temperature increased, the ANN 


[Figure 28.4. Axes: Predicted Bobwhite Count (y) versus Precipitation (cm) (x).]


Figure 28.4. Neural network simulation results (solid line) and 
regression predictions (dashed line) of the response of north- 
ern bobwhite (Colinus virginianus) counts to seasonal precipi- 
tation in the Rolling Red Plains of western Oklahoma. Winter 
months (a) included December, January, and February; spring 
months (b) included March, April, and May; and summer 
months (c) included June, July, and August. Precipitation is re- 
ported in centimeters, but each plot has its own scale. 


model predicted that bobwhite counts decreased non- 
linearly. However, the regression model predicted bob- 
white counts would not respond strongly to July tem- 
perature, although the regression predictions did 
decrease with increasing July temperature (Fig. 28.3b). 

There was a near-linear relationship between win- 
ter precipitation and bobwhite counts as predicted by 
the ANN model (Fig. 28.4a). The regression model 
predicted a positive linear relationship (Fig. 28.4a). In-
creases in winter precipitation were related to in- 
creased bobwhite counts, but counts decreased with 
both spring and summer precipitation (Figs. 28.4b 
and 28.4c, respectively). These predictions matched 
those of the regression model, in that they predicted

Figure 28.5. Neural network simulation results (solid line) and
regression predictions (dashed line) of the response of bobwhite
(Colinus virginianus) counts in the Rolling Red Plains of
western Oklahoma to (a) the proportion of county area in agricultural
production, (b) cattle density on nonagricultural lands,
and (c) the previous year's northern bobwhite count. Cattle 
density is reported as total number of head per square kilome- 
ter of nonagricultural land. 


decreases. However, the ANN model suggested non- 
linearities in the responses. 

Predicted bobwhite counts reached their maximum 
value at midlevels of the proportion of county area in 
agricultural production and the number of bobwhites 
counted during the previous year's survey (Figs. 28.5a
and 28.5c, respectively). The regression model pre- 
dicted little response of bobwhite counts to the pro- 
portion of county area in agriculture, but there was a 
positive trend (Fig. 28.5a). The regression model also 
predicted a linear increase in bobwhite counts with in- 
creasing previous year's counts (Fig. 28.5c). Predicted 
bobwhite counts also increased near-linearly with in- 
creasing cattle density, although the regression model 


showed little effect of cattle density on bobwhite 
counts (Fig. 28.5b). 


Discussion 


The application of ANN modeling techniques to the 
study of ecological phenomena has great potential for 
understanding complex, dynamic processes (Colasanti 
1991; Edwards and Morse 1995; Lek et al. 1996b). 
However, to date, little research has made use of this 
tool. When applied to an ecological research problem, 
ANN models have consistently outperformed tradi- 
tional statistical models (Recknagel et al. 1997; Maier 
et al. 1998). Artificial neural networks have proved 
highly effective in predicting aboveground biomass in 
the tallgrass prairie (Olson and Cochran 1998). Com- 
pared to regression models, ANNs predicted biomass 
and described changes in standing biomass with sub- 
stantially greater accuracy. Heffelfinger et al. (1999) 
used ANNs to accurately predict call counts and age 
ratios for Gambel's quail (Callipepla gambelii) in Ari- 
zona from precipitation and temperature data. Other 
studies have used ANNs to accurately predict trout 
(Salmo trutta) abundance (Baran et al. 1996; Lek et al. 
1996a). Mastrorillo et al. (1997) used a neural model 
to correctly predict the presence of three small-bodied 
fish in freshwater streams in more than 80 percent of 
cases. Özesmi and Özesmi (1999) compared ANNs
with logistic regression to classify locations in a GIS 
database as nest or non-nest sites for red-winged 
blackbirds (Agelaius phoeniceus) and marsh wrens 
(Cistothorus palustris) based on site characteristics. 
Their ANN models outperformed logistic regressions 
in all but one case. The better performance of the 
ANN model resulted because nest-site selection by 
these marsh-nesting species was a nonlinear process. 
For our data set, the regression model performed 
better than the ANN model based on the adjusted 
sum-of-squares criterion. Our neural model also per- 
formed poorly when presented with 1997 data, but 
the weather in 1997 was outside the range of condi- 
tions used to train the model. We have found that the 
magnitude of deviations from long-term mean condi- 
tions may have a greater effect on bobwhite popula- 
tions than yearly weather conditions (Lusk et al. un-
published manuscript). This may in part be


TABLE 28.2.

Contribution of each independent variable to the artificial neural network and regression
models’ predictions of northern bobwhite (Colinus virginianus) abundance in the Rolling Red
Plains of Oklahoma.

                                  Neural network   Regression model
                                  Percent          Percent
Independent variable              contribution     contribution(a)   t       P
Mean June temperature (C)         [illegible]      2                 -0.75   0.4568
Mean July temperature (C)         7.5              1                 -0.31   0.7540
Mean August temperature (C)       16.0             5                  0.57   0.5702
Winter precipitation (cm)         12.5             54                 2.30   0.0245
Spring precipitation (cm)         7.0              15                -1.47   0.1462
Summer precipitation (cm)         16.0             9                 -1.06   0.2913
Proportion cropland(b)            4.0              3                  0.14   0.8928
Cattle density(c)                 18.5             0                  0.17   0.8637
Previous year’s bobwhite count    8.0              11                 1.57   0.2218

(a) Individual R² expressed as a percentage of the total R² (0.166) accounted for by the model.
(b) Proportion of county area in agricultural production.
(c) Total head per hectare of nonagricultural land.


responsible for the network’s poor performance in 
1997. However, the additional knowledge gained by 
using the ANN modeling technique is essential for 
successful management. Management and conserva- 
tion decisions based on incomplete or misleading in- 
formation can only harm the species of concern. Sim- 
plicity is only one criterion by which to judge a 
model’s performance. Also important is the ability of 
the model to approximate the process under investiga- 
tion (Burnham and Anderson 1998:23). The ANN 
model provided more biologically meaningful predic- 
tions of responses, because the ANN was able to find 
the nonlinear elements of the responses. We believe 
that the length of the data set may have limited our re- 
sults. The six years for which we have data may not 
have sufficiently captured the response of bobwhites 
to climate variables. Dynamics in semiarid areas are 
characterized by episodic events that require long- 
term data. Model accuracy is a function of sample size 
(Smith 1996:134). Furthermore, with small sample 
sizes, such as those used in our study, the effects of 
noise on the model’s performance are amplified, espe- 
cially if the relationship being modeled is complex 
(Smith 1996:115). Too small a sample size can reduce
the ability of the ANN model to generalize, but


there are no sample-size restrictions to the application 
of neural networks (Paruelo and Tomasel 1997). 
Using simulations (Lek et al. 1996a, Heffelfinger et 
al. 1999, Özesmi and Özesmi 1999), ANN models
provide information about the effects of the independ- 
ent variables on bobwhite abundance. This not only 
provides a better understanding of bobwhite ecology, 
but also allows us to evaluate the ANN model’s ex- 
planatory ability. June, July, and August temperatures 
were important contributors to the model’s predic- 
tions (Table 28.2); however, August temperature con- 
tributed more than June or July temperatures. The 
higher importance of August temperature may be an 
artifact of counting quail in the fall. Because climate 
conditions can affect the daily activity patterns of bob- 
whites (Roseberry and Klimstra 1984), conditions 
during the roadside counts may have a larger influence 
on the network’s predictions. This influence is the re- 
sult of the more-direct effect of the conditions during 
the count on the count’s outcome. Our model pre- 
dicted that bobwhite abundance would increase with 
June and August temperature, but only to a certain 
temperature, after which counts declined. The increase 
in counts predicted at high June temperatures is prob-
ably the result of too few data points in that part of
the range, making the predictions susceptible to outliers.
Had we limited our simulation data set to within
one standard deviation of the mean, the effects of
outliers might have been reduced. Predicted bobwhite
counts decreased with increasing July temperature. 
Summer heat decreased California quail (Callipepla 
californica) chick survival in California (Sumner 
1935). Quail productivity was negatively associated 
with summer temperature in northwest Florida (Mur- 
ray 1958), and July-August temperature was nega- 
tively associated with the length of the nesting season 
and positively associated with nest abandonment in 
southern Illinois (Klimstra and Roseberry 1975). July 
temperature decreased the age ratios of Gambel’s quail 
in Arizona (Heffelfinger et al. 1999). Bobwhites in 
Texas avoided habitat space-time (Guthery 1997) in 
which the operative temperature was more than 39 
degrees Celsius (Forrester et al. 1998). 

Our ANN model indicated a near-linear, positive 
relationship between winter precipitation and pre- 
dicted bobwhite counts. This near-linearity probably 
accounts for the significance of this variable in the re- 
gression model (Table 28.2). Winter precipitation may 
indirectly influence bobwhite abundance through in- 
creased spring vegetation, seed, and insect production 
(Swank and Gallizioli 1954; Sowls 1960). Scaled quail 
(Callipepla squamata) abundance in Texas (Giuliano 
and Lutz 1993) and bobwhite harvest in Illinois (Ed- 
wards 1972) were strongly positively correlated with 
January-March precipitation. Spring and summer pre- 
cipitation had negative curvilinear relationships with 
bobwhite abundance. Among gallinaceous birds, 
young are susceptible to precipitation for the first few 
days of life (Newton 1998) and increased rain early in 
the hatching season may lead to increased juvenile 
mortality (Sumner 1935). Although most studies of 
the effects of spring precipitation on quail abundance 
report a nonsignificant relationship (e.g., Campbell 
1968; Campbell et al. 1973; Heffelfinger et al. 1999), 
spring rain might affect breeding behavior adversely, 
therefore reducing fall abundance. 

Similar to the findings of Roseberry and Sudkamp 
(1998), our model predicted bobwhite abundance to 
be greatest at intermediate levels of agricultural land 
use. As agricultural land increases, initially there may 
not be a net loss of usable space-time for bobwhite. 


Bobwhite abundance at low proportions of agricul- 
tural use may result from an abundance of mid- to 
late-successional habitat, less suitable for bobwhites. 
Similar to the intermediate disturbance hypothesis 
(Connell 1978), intermediate levels of agriculture may 
provide bobwhites with more of the habitat compo- 
nents necessary to support large populations than less 
agriculturally developed lands. Other research has in- 
dicated that bobwhites are associated with patchy het- 
erogeneous landscapes with moderate levels of grass- 
land, row crop, and woody edge (Roseberry and 
Sudkamp 1998). However, as the proportion of agri- 
cultural land increases, there is a net loss of usable 
space-time, any further edge becomes redundant 
(Guthery and Bingham 1992), and quail abundance 
declines. 

Predicted bobwhite counts increased with increas- 
ing cattle density. This is counter to other research 
that indicates grazing negatively influences quail habi- 
tat (Schemnitz 1961). However, Spears et al. (1993) 
found that site productivity governs the seral stage 
most important to bobwhites. Early successional 
stages are favorable for bobwhites on more productive 
sites, whereas late seral stages are favorable on less 
productive sites. Because western Oklahoma is semi- 
arid, and therefore less productive, the positive re- 
sponse we found (Fig. 28.5b) is not consistent with 
expectations. 

Predicted bobwhite abundance showed a weak 
but discernible density-dependent effect in relation to 
the previous year's bobwhite count. For bobwhite 
counts higher than about twenty-five, predicted 
counts for the next fall decreased. The implication of 
this result is that at current levels of habitat space- 
time availability, bobwhite abundances above a cer- 
tain level will adversely affect the population as a 
whole. In other words, the available habitat space- 
time can only support a given number of bobwhites, 
regardless of climate conditions beneficial to bob- 
white increase. 


Conclusions 


We believe ANN modeling techniques offer wildlife 
managers and conservationists a valuable and
powerful tool for managing species of concern.
Although the ANN model did not outperform the regression
model based on the adjusted sum-of-squares
criterion, the ANN model did provide a better under- 
standing of how bobwhite abundances in the Rolling 
Red Plains of Oklahoma respond to climate and land- 
use variables. Nonlinear relationships, although wide- 
spread in nature, are often ignored by researchers 
(Gates et al. 1994). The ability of the ANN technique 
to find the nonlinear responses of quail abundance to 
climate variables makes ANN models preferred to tra- 
ditional linear and nonlinear techniques that require 
the specification of the curvilinear response variable. 
A lack of knowledge of the natural history of many 
species makes specification of the correct polynomial 
term a matter of trial and error. 

Model validation indicated that the ANN tech- 
nique was accurate for this region of Oklahoma, but 
the increase in power was only due to the increased 
parameterization of the ANN model. However, use of 
linear modeling techniques may result in a misunder- 
standing of the factors influencing a particular 
process. Our regression analysis was only able to iden- 
tify the linear relationship between winter rain and 
bobwhite abundance. Any management or conserva- 
tion plan must take into account climatic factors if it 
is to be successfully implemented. Furthermore, the 
ANN model we described can continue to learn as
more data become available, and can, therefore, be 
used as part of an adaptive management plan (Morri- 
son et al. 1998). Our analysis was limited to a six-year 
data set that may not have represented the entire spec- 
trum of response by bobwhites to climate variables. 
The predictions of the simulation analyses can be used 


to generate hypotheses suitable for empirical testing 
(Recknagel et al. 1997). Simulations also can be used 
to judge the biological realism of the ANN predictions 
and increase the understanding of the factors influenc- 
ing a species' abundance. The use of ANN models also 
can allow more cost-effective management because the 
data used to generate the predictions are readily avail- 
able and cheaply obtained. Our model will be used by 
the Oklahoma Department of Wildlife Conservation 
to estimate bobwhite abundances for the management 
of the fall harvest. A similar modeling effort is under- 
way for the Texas Parks and Wildlife Department. We will
develop a model that managers can use to better
manage bobwhites in Texas.


Incorporating Detection Uncertainty
into Presence-Absence Surveys for
Marbled Murrelet


Howard B. Stauffer, C. John Ralph, and Sherri L. Miller 


There is a long tradition associated with sample
surveys for presence or absence of flora or fauna
at sites or stations in sampling units where detectabil-
ity may be an issue. Species may be present at a sam-
pling unit yet fail to be detected. Historically, the
problem with detectability has often been ignored. Re-
cently, attention has been given to the development of
survey protocols that increase the likelihood of detec-
tion. Often, these protocols call for repeated visits to a
sampling unit.

It is a key requirement in the design of an increas- 
ing number of surveys that the numbers of visits to 
sampling unit sites ensure a sufficient level of proba- 
bility, say 95 percent, that species be detected, either in 
an individual sampling unit, or in the entire survey re- 
gion, if they are present. In such instances, the specific 
objective of the survey is to test the null hypothesis H0
that species are not present, versus the alternative hy-
pothesis HA that species are present. The 95 percent
probability of at least one detection is the power of 
the survey. 
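The link between the number of visits and the power of a survey can be made concrete with a simple calculation. The sketch below assumes that visits are independent and that the per-visit detection probability p is constant when the species is present; neither assumption is stated in the text above, and the function name is hypothetical.

```python
import math


def visits_needed(p, power=0.95):
    """Smallest number of visits n with 1 - (1 - p)**n >= power.

    Assumes independent visits with constant per-visit detection probability p.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - power) / math.log(1.0 - p))


for p in (0.3, 0.5):
    n = visits_needed(p)
    print(p, n, 1.0 - (1.0 - p) ** n)
```

Lower detection probabilities drive the required number of visits up quickly, which is why repeated-visit protocols are sensitive to how p is estimated.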

When counts are taken to estimate abundance, var- 
ious strategies have been developed to address the 
issue of detectability. Capture-recapture methods 
allow the estimation of recapture probabilities for 
both closed and open systems (Otis et al. 1978; Pol- 
lock et al. 1990). Distance sampling allows the estima- 
tion of a detection function to compensate for the loss 


of detectability at increasing distances away from an 
observer in line transect and point transect surveys 
(Buckland et al. 1993). An extensive literature exists 
describing estimators for these methodologies, both 
capture-recapture (e.g., Jolly 1965, 1982; Cormack 
1968, 1979; Nichols et al. 1981; Pollock 1981; Seber 
1982, 1986; White et al. 1982; Burnham et al. 1987) 
and distance sampling (e.g., Burnham et al. 1980). A 
modified version of Emlen’s method also addresses the 
issue of detectability for count response (Ramsey and 
Scott 1981; Scott et al. 1986). 

For presence-absence surveys, Azuma et al. (1990) 
addressed some aspects of this problem using spotted 
owls (Strix occidentalis) as an example. They pro- 
posed a fixed number of visits to each sampling unit 
and a bias adjustment to compensate for false nega- 
tives when estimating the proportion of occupied sam- 
pling units. Link et al. (1994) found that within-site 
sampling variability is a significant portion of overall 
variability in breeding bird surveys, particularly for 
species with low abundance levels. Pendleton (1995) 
recommended two strategies for addressing the effects 
of variation in detectability probabilities in bird point 
count surveys: standardizing surveys, and obtaining 
separate estimates of detection rates and adjusting for 
them. Kendall et al. (1992) commented on the prob- 
lem with detectability in a power analysis study of 
grizzly bears (Ursus arctos), recommending multiple 
strata and optimal timing of surveys to enhance the 
power of the design. Zielinski and Stauffer (1996)
were also concerned with detectability probabilities in 
a power analysis for fisher (Martes pennanti) and 
American marten (Martes americana), recommending 
multiple-station sampling units with repeated visits. 
Sargent and Johnson (1997) noted the problem of de- 
tectability with carnivores due to secretive behavior 
and low densities. 

Predictive accuracy assessment of wildlife habitat 
relationship models is dependent upon the quality of 
the response data. Adjustments for the uncertainty of 
detection with response data must be taken into ac- 
count. Young and Hutto (Chapter 8) found that prob- 
lems of detectability can be reduced by using presence- 
absence responses rather than counts in a survey. They 
obtained different results in logistic regression and 
Poisson regression analysis of habitat relationships for 
Swainson’s thrush (Catharus ustulatus) over three suc- 
cessive years, partially due to problems with de- 
tectability. They collected data on ten-point transects 
for the three years to mitigate their uncertainty of de- 
tection. Karl et al. (Chapter 51) examined the effects 
of rarity on the predictive accuracy of habitat relation- 
ship models. They observed that errors of commission 
(species predicted but not detected) are either real or 
apparent. Real errors are caused by species-specific be- 
havior such as the avoidance of humans, cryptic na- 
ture, episodic appearance, or temporal and spatial 
variation. Apparent errors, on the other hand, are 
caused by inefficient or limited sampling where there 
is uncertainty of detection. Reed (1996) discussed the 
influences of detectability in drawing inferences about 
extinction caused by species density, sampling effort, 
habitat structure, visibility, observer bias, number of 
observers, ambient noise, season, and weather. Stauf- 
fer (Chapter 3) cautioned against the use of inade- 
quate data in his historical survey of statistical meth- 
ods applied to wildlife habitat modeling and 
concluded that simple models, using 0-1 response 
data, may work best. Authors do not always explicitly 
address the effect of measurement Type II error caused 
by problems with detectability (species present but not 
observed) in their assessments of wildlife habitat rela- 
tionship model accuracy (e.g., see Conroy and Moore, 
Chapter 16; Elith and Burgman, Chapter 24; Fielding, 
Chapter 21; Henebry and Merchant, Chapter 23;
Robertsen et al., Chapter 34; Rotenberry et al., Chapter 22).

The marbled murrelet (Brachyramphus marmora- 
tus) is a particularly important case in point. Federally 
listed as threatened (USFWS 1997) and listed as en- 
dangered in California, the murrelet is difficult to de- 
tect on land. This species nests in the canopy of trees 
in mature and old-growth forests. Each pair spends 
approximately two months of the April-through- 
September nesting period incubating and feeding one 
nestling, and the rest of the year is spent at sea (Ralph 
et al. 1992). Their flight is rapid and often silent. Fur- 
thermore, their detectability is often affected by visi- 
bility at survey sites (O’Donnell et al. 1995). Estimates 
of detectability at survey stations with occupied be- 
havior (see discussion) in six different redwood stand 
types in California have ranged from 29 to 86 percent, 
with a mean of 59 percent. In individual stands with 
25 or more stations surveyed, estimates of detectabil- 
ity have ranged from 12 to 100 percent (H. B. Stauffer 
personal observations). 

Problems with detectability during repeated pres- 
ence-absence surveys have lacked a statistical model 
structure to describe the distribution of the possible 
survey outcomes for sampling units. It is the objective 
of this chapter to present such a model and describe 
its practical application. The theory is illustrated
with marbled murrelet surveys in the Pacific coast
forests of North America, specifically an inland
survey for murrelets in low-abundance areas within
national forests of California.


Methods 


Marbled murrelet terrestrial surveys on the Pacific 
Coast of North America follow a standardized proto- 
col developed by the Pacific Seabird Group (Ralph 
and Nelson 1992; Ralph et al. 1994). Sampling units, 
up to 48.6 hectares (120 acres) in size, are surveyed 
for two-hour visits at dawn. Each sampling unit is sur- 
veyed for presence four times each year for two 
years—a total of eight visits. The station-visits are dis- 
tributed over the murrelet nesting season. Observers 
record murrelet activity consisting of visible and audi- 
ble detections of varying nesting and non-nesting 
behaviors. 

Murrelets are extremely cryptic and individuals are
not easily distinguished. Although identifiable as mur- 
relets as they fly into a stand, detections cannot be 
readily translated into distinct counts of individuals. 
We have focused our attention on the presence of nest- 
ing or non-nesting behaviors as an alternative measure 
of bird activity. 


An Inland Survey for 
Marbled Murrelets in California 


We are using data from extensive surveys for mur- 
relets conducted by the United States Department of 
Agriculture (USDA) Forest Service, Six Rivers Na- 
tional Forest, in low-abundance inland areas identified 
as Management Zone 2 in California by the Forest 
Ecosystem Management Assessment Team (FEMAT) 
(USDA et al. 1993; Hunter et al. 1998). The primary 
objective of these surveys has been to determine if 
murrelets are present in specified regions. They used 
forest type and geographic location to define habitat 
strata that were surveyed for presence or absence, 
using 48.6-hectare sampling-unit locations. These 
sampling units were visited four times per nesting sea- 
son in each of two consecutive years following the 
guidelines of the standardized marbled murrelet pro- 
tocol (above). It was a critical requirement in the de- 
sign of the survey that sample sizes be sufficient in 
each stratum to ensure a 95 percent probability of at 
least one murrelet detection if they were present in 3 
percent of the area. Thus, the objective of this survey 
was to test, for each stratum, the null hypothesis H_0
that murrelets were not present versus the alternative
hypothesis H_A that murrelets were present, with a
power of 95 percent. They assumed the confidence of 
the survey was 100 percent; in other words, that there 
would be no significant Type I error, or false positives. 


Incorporating Detectability into the 
Binomial Model 


For presence-absence surveys where there is uncer- 
tainty of detection, detectability can be incorporated 
into the binomial model so that options for power and 
sample size can be selected for the survey design. It 
can be incorporated using an adjustment to the proba- 
bility parameter. The binomial distribution B(X;P,n)
(Cochran 1977; Särndal et al. 1992; Thompson 1992)
is described by the probability distribution


$$B(X = x; P, n) = \binom{n}{x} P^{x} (1 - P)^{n - x}$$

where x is the number of sampling units where the 
species is present, P is the probability of presence of 
the species in a sampling unit, and n is the total num- 
ber of units sampled. Note that x can vary between 0 
and n. The model assumes that the total number of 
sampling units in the sampling frame is large com- 
pared to the number sampled, or that the sampling is 
performed with replacement. Otherwise, the probabil- 
ity P of presence would not remain constant as the 
sampling proceeds in a draw-sequential scheme (Särndal
et al. 1992). The model also assumes complete certainty
of detection, if the species is present in a sampling
unit. What happens in surveys where complete
certainty of detection is not the case? We need to develop
an adjusted model that incorporates uncertainty
of detection into its assumptions.

An adjusted binomial model B_d(X;P,n,p,m) generalizes
the binomial model B(X;P,n) to incorporate detectability,
using four parameters: P = the probability
of presence; n = the number of units sampled; p = the
conditional probability of detection, if present, with
one visit to a sampling unit; and m = the number of
visits to the units sampled. The model is described by
the probability distribution


$$B_d(X = x; P, n, p, m) = \sum_{j=x}^{n} \binom{n}{j} P^{j} (1 - P)^{n-j} \binom{j}{x} (p')^{x} (1 - p')^{j-x}$$

where p' = 1 - (1 - p)^m describes the conditional probability
of at least one detection, with m visits to a sampling
unit, if the species is present. This distribution
describes the probability of x, the number of sampling
units where the species was present and detected, as
the sum of the following probabilities: (1) the probability
of sampling x units with the species present, successfully
detecting it all x times (of n total sampling
units); plus (2) the probability of sampling (x + 1)
units with the species present, successfully detecting it
x times and failing to detect it once; plus (3) the probability
of sampling (x + 2) units with the species present,
successfully detecting it x times and failing to
detect it twice; ...; plus (4) the probability of sampling
n units with the species present, successfully detecting
it x times and failing to detect it (n - x) times.
The binomial coefficients count the number of combinations
of such possibilities. Again, note that x can
vary between 0 and n.

The B_d model incorporates detectability into the binomial
model. It assumes that the sampling units and
visits are independent Bernoulli events. It also assumes
that the parameters P and p are fixed throughout the
population. The B_d model is actually a special case of
a compound binomial-binomial distribution (Johnson
and Kotz 1969:194, eq. 36). It can be shown directly
from basic assumptions of the B_d model, or with algebraic
simplification (J. A. Baldwin personal communication),
that B_d(X;P,n,p,m) = B(X;Pp',n).
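
The algebraic simplification mentioned above can be sketched
in a few lines. This is a reconstruction from the model's
stated assumptions, not the derivation supplied in the
personal communication. It writes the adjusted model as the
sum over j, the number of sampled units where the species is
present, uses the identity
\binom{n}{j}\binom{j}{x} = \binom{n}{x}\binom{n-x}{j-x},
and substitutes k = j - x (the index k is introduced here
for illustration only).

```latex
\begin{align*}
B_d(X = x; P, n, p, m)
  &= \sum_{j=x}^{n} \binom{n}{j} P^{j} (1 - P)^{n-j}
     \binom{j}{x} (p')^{x} (1 - p')^{j-x} \\
  &= \binom{n}{x} (Pp')^{x} \sum_{k=0}^{n-x} \binom{n-x}{k}
     \left[ P(1 - p') \right]^{k} (1 - P)^{n-x-k} \\
  &= \binom{n}{x} (Pp')^{x} \left[ P(1 - p') + (1 - P) \right]^{n-x} \\
  &= \binom{n}{x} (Pp')^{x} (1 - Pp')^{n-x}
   = B(X = x; Pp', n).
\end{align*}
```

The third line follows from the binomial theorem, and the
last from P(1 - p') + (1 - P) = 1 - Pp'.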


Power for a Sampling Unit: Power_unit


The power of a survey at a single sampling unit,
power_unit, the probability of successfully obtaining at
least one detection with repeated visits to a sampling
unit, is given by the formula


$$\text{power}_{\text{unit}} = p' = 1 - (1 - p)^{m}$$


where p is the conditional probability of detection, 
with one visit, if the species is present, and m is the 
number of visits to the sampling unit. We assume the 
visits are independent and the conditional probability 
p is constant. 

One can calculate power_unit by using estimates of
detectability p, based upon previous surveys, in the
formula. Alternatively, if estimates are not available,
one can substitute low values for p and obtain approximate
lower bounds on power_unit.

Conversely, one can calculate the number of visits
necessary to ensure desired levels of power_unit by substituting
the prescribed power_unit and lower bounds
on p and solving for m in the equation as follows:


$$m = \frac{\log(1 - \text{power}_{\text{unit}})}{\log(1 - p)}$$
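
The two sampling-unit formulas above translate directly into
a few lines of code. The following is a minimal sketch in
Python, assuming (as the text does) independent visits and a
constant p. The function names power_unit and visits_needed
are illustrative only and are not part of any survey
protocol. Because m must be a whole number of visits, the
sketch rounds the solution up.

```python
import math


def power_unit(p: float, m: int) -> float:
    """Probability of at least one detection in m independent visits,
    given that the species is present: p' = 1 - (1 - p)^m."""
    return 1.0 - (1.0 - p) ** m


def visits_needed(target_power: float, p: float) -> int:
    """Smallest whole number of visits m for which power_unit(p, m)
    reaches target_power (the solution for m, rounded up)."""
    m = math.ceil(math.log(1.0 - target_power) / math.log(1.0 - p))
    return max(m, 1)


if __name__ == "__main__":
    for p in (0.10, 0.30, 0.50, 0.70, 0.90):
        m = visits_needed(0.95, p)
        print(f"p = {p:.2f}: m = {m}, power_unit = {power_unit(p, m):.4f}")
```

With p = 0.10 the sketch returns 29 visits for a 95 percent
target. With p = 0.30 it returns 9, because eight visits give
a power_unit of about 94.24 percent, just under the target.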


Power for a Regional Survey: Power_region


The power for a target population in an entire survey
region, power_region, can be calculated as the complement
of the probability of zero detections in a survey
of the region, 1 - B_d(0;P,n,p,m), using the B_d model:


$$\text{power}_{\text{region}} = 1 - \left(1 - P\left[1 - (1 - p)^{m}\right]\right)^{n} = 1 - (1 - Pp')^{n}$$


We are referring here to the presence or absence of 
a target population in a geographical region consisting 
of multiple sampling units, such as a ranger district, 
multiple river drainages, or a national forest, typically 
10,000 hectares or larger. Conversely, sample size n 
can be calculated, if the desired power_region and number
of visits m are specified along with lower bound
estimates of P and p:


$$n = \frac{\log(1 - \text{power}_{\text{region}})}{\log\left(1 - P\left[1 - (1 - p)^{m}\right]\right)} = \frac{\log(1 - \text{power}_{\text{region}})}{\log(1 - Pp')}$$
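
The regional power formula and its inversion for n can be
sketched the same way. This is a minimal Python illustration
under the model's assumptions of fixed P and p; the function
names are hypothetical, and the sample size is rounded up to
the next whole sampling unit.

```python
import math


def detection_given_presence(p: float, m: int) -> float:
    """p' = 1 - (1 - p)^m, the chance of at least one detection in m visits."""
    return 1.0 - (1.0 - p) ** m


def power_region(P: float, p: float, m: int, n: int) -> float:
    """Probability of at least one detection among n sampling units."""
    return 1.0 - (1.0 - P * detection_given_presence(p, m)) ** n


def sample_size(target_power: float, P: float, p: float, m: int) -> int:
    """Smallest whole n for which power_region reaches target_power."""
    n = math.log(1.0 - target_power) / math.log(
        1.0 - P * detection_given_presence(p, m)
    )
    return math.ceil(n)


if __name__ == "__main__":
    # Lower-bound values of the kind discussed in the text.
    n = sample_size(0.95, P=0.03, p=0.10, m=8)
    print(n, round(power_region(0.03, 0.10, 8, n), 4))
```

With P = 3 percent, p = 10 percent, and m = 8 visits, the
sketch gives n = 174 for a 95 percent power_region.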


Results 


Below, we summarize our results in three parts: (1) in- 
corporating detectability into the binomial model; (2) 
power for a sampling unit; and (3) power for a re- 
gional survey. 


Incorporating Detectability into 
the Binomial Model 


Figure 29.1 shows the probabilities for the B_d model
B_d(X;P,n,p,m), contrasted with those of the binomial
model B(X;P,n), for the case where a small survey of
ten sampling units (i.e., n = 10) is conducted in a region
where the species is present in 30 percent of the
area (P = 30 percent). For the B_d model, we consider
the case where the conditional probability of detection
with one visit is p = 30 percent, and there are m = 2, 4,
and 6 visits to sampling units.

The three pairs of contrasting bar graphs (Fig.
29.1) illustrate that for low values of X, the probabilities
that X sampling units, out of the ten sampled,
would have detections are greater when detectability is
uncertain. For example, note in the figure that the
probability of detections at zero of the sampling units
(X = 0) for the binomial model (white bar) is approximately
3 percent (i.e., power_region = 97 percent),
whereas the probabilities of X = 0 for the B_d
model are approximately 19, 8, and 5 percent (i.e.,
power_region = 81, 92, and 95 percent) with m = 2, 4,
and 6 visits, respectively (black bars). With small
numbers of visits (e.g., m = 2), the B_d model probabilities
(black) are much higher for lower X values, in
contrast to the binomial model probabilities (white).
With fewer visits, sampling units having
the species present will be more likely to have zero
detections; the probability of Type II error of false
negatives will be greater. As the number of visits increases,
the probabilities in the B_d distribution approach
those of the binomial model that has complete
certainty of detection.

[Figure 29.1: paired bar graphs titled "B_d vs. Binomial
Probabilities," with one panel per number of visits per site
(beginning with 2 visits per site); y-axis, Probability (%)
of Detections at X Units; x-axis, No. of Sampling Units (X)
With Detections.]

Figure 29.1. Comparison of the B_d model B_d(X;P,n,p,m) with
the binomial model B(X;P,n) where X is number of sampling
units with detections. We surveyed n = 10 sampling units,
each with a probability of presence P = 30%, conditional
probability of detection with one visit p = 30%, and m = 2, 4, and 6
visits to sampling units. The bars show the probability
of X sampling units having detections out of ten surveyed.
For example, with m = 2 visits (top graph), the probability of X
= 0 sampling units with detections with the B_d model is 19 percent
(i.e., power_region = 81%) (black bar) in contrast to 3 percent
(i.e., power_region = 97%) for the binomial model with complete
certainty of detection (white bar). Additional visits to the
sampling units decrease the probability of zero detections to
8 percent and 5 percent (i.e., power_region = 92% and 95%),
respectively, for m = 4 and 6 visits (black bars) (middle and
bottom graphs). The probability of X = 1 sampling units with
detections is 34 percent, 22 percent, and 17 percent with m = 2,
4, and 6 visits, respectively, for the B_d model in contrast with
12 percent for the binomial model.

In summary, when detection is uncertain, calcula- 
tions of power based upon the binomial model will be 
misleadingly high. This could result in a greater likeli- 
hood of false negatives (i.e., undetected presence). 
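
The comparison between the two models can be checked
numerically. The following is a minimal Python sketch, not
the authors' code. It evaluates the adjusted model as the sum
described in the Methods (over j, the number of sampled units
where the species is present) and compares it with the
ordinary binomial model; the function names are hypothetical.

```python
from math import comb


def binomial_pmf(x: int, n: int, prob: float) -> float:
    """Ordinary binomial probability of x successes in n trials."""
    return comb(n, x) * prob**x * (1.0 - prob) ** (n - x)


def adjusted_pmf(x: int, P: float, n: int, p: float, m: int) -> float:
    """Adjusted binomial probability of detections at x of n units.

    Sums over j = x..n: j units have the species present (binomial in P),
    and it is detected at x of those j units (binomial in p')."""
    p_prime = 1.0 - (1.0 - p) ** m
    return sum(
        binomial_pmf(j, n, P) * binomial_pmf(x, j, p_prime)
        for j in range(x, n + 1)
    )


if __name__ == "__main__":
    P, n, p = 0.30, 10, 0.30  # the case plotted in Figure 29.1
    print(f"binomial model, X = 0: {binomial_pmf(0, n, P):.3f}")
    for m in (2, 4, 6):
        zero = adjusted_pmf(0, P, n, p, m)
        print(f"m = {m}: P(X = 0) = {zero:.3f}, power_region = {1 - zero:.3f}")
```

The same values are obtained from the ordinary binomial
formula with Pp' in place of P, which is a quick way to
confirm the equivalence stated in the Methods.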


Power for a Sampling Unit: Power_unit


We calculated the power for a sampling unit,
power_unit, the probability of a detection during at least
one visit, with increasing numbers of visits m to a
sampling unit (Table 29.1). Table 29.1 presents a range
of levels of conditional probability p of detection with
one visit: 10, 30, 50, 70, and 90 percent. Fewer visits
are necessary to realize a 95 percent power for a
sampling unit as the conditional probability of detection
increases. For example, with a 10 percent conditional
probability of detecting the birds in one visit,
twenty-nine visits are necessary for a 95 percent probability
of at least one detection at a sampling unit.
With a 90 percent conditional probability of detection,
on the other hand, only two visits are required
for a 95 percent power of successfully detecting
presence.

With a 30 percent conditional probability of detec- 
tion with one visit, eight visits will ensure an approxi- 
mate 95 percent power of at least one detection. The 
current marbled murrelet survey protocol, based upon 
eight visits to sampling units (Ralph and Nelson 1992; 
Ralph et al. 1994), ensures an approximate 95 percent 
power for values of p as low as 30 percent. The Pacific 
Seabird Group is currently revisiting the protocol to 
consider revising the number of visits, since some esti- 
mates for p, particularly in low-abundance areas, have 
been falling below the 30 percent threshold. A study is 
in progress to determine if it will be necessary to revise 
the protocol, at least for sampling unit locations in 
some regions. 

TABLE 29.1.

Power at a sampling unit (power_unit) for presence-absence
surveys.

p^a (%)    m^b    power_unit^c (%)
10          2     19.00
10          4     34.39
10          6     46.86
10          8     56.95
10         10     65.13
10         12     71.76
10         14     77.12
10         16     81.47
10         18     84.99
10         20     87.84
10         22     90.15
10         24     92.02
10         26     93.54
10         28     94.77
10         30     95.76
30          2     51.00
30          4     75.99
30          6     88.24
30          8     94.24
30         10     97.18
50          2     75.00
50          4     93.75
50          6     98.44
70          1     70.00
70          2     91.00
70          3     97.30
90          1     90.00
90          2     99.00

^a p = conditional probability of detection at the sampling unit, with one
visit.
^b m = number of visits to the sampling unit.
^c power_unit = probability of detecting presence at the sampling unit, with
m visits (= p').


Power for a Regional Survey: Power_region


We calculated sample sizes required to realize specified
levels of power_region (95, 90, and 80 percent), for varying
levels of the probability of presence P (1, 3, 5, and
10 percent), and varying levels of detectability p (10,
25, and 50 percent) (Table 29.2). In these calculations,
we assumed eight visits per sampling unit, corresponding
to the marbled murrelet protocol. With P = 1 percent
and p = 10 percent, 525 sampling units are necessary
to realize a 95 percent power for a target population
in an entire region. At the other extreme, with the
more optimistic levels P = 10 percent and
p = 50 percent, only twenty-nine sampling units are
required for 95 percent power.


TABLE 29.2.

Sample sizes (n) for target populations in a region for
presence-absence surveys.

P^a (%)    p^b (%)    m^c    power_region^d (%)      n
 1         10          8     95                    525
 1         10          8     90                    404
 1         10          8     80                    282
 1         25          8     95                    332
 1         25          8     90                    255
 1         25          8     80                    179
 1         50          8     95                    300
 1         50          8     90                    231
 1         50          8     80                    161
 3         10          8     95                    174
 3         10          8     90                    134
 3         10          8     80                     94
 3         25          8     95                    110
 3         25          8     90                     85
 3         25          8     80                     59
 3         50          8     95                     99
 3         50          8     90                     76
 3         50          8     80                     54
 5         10          8     95                    104
 5         10          8     90                     80
 5         10          8     80                     56
 5         25          8     95                     66
 5         25          8     90                     51
 5         25          8     80                     35
 5         50          8     95                     59
 5         50          8     90                     46
 5         50          8     80                     32
10         10          8     95                     52
10         10          8     90                     40
10         10          8     80                     28
10         25          8     95                     32
10         25          8     90                     25
10         25          8     80                     18
10         50          8     95                     29
10         50          8     90                     22
10         50          8     80                     16

^a P = probability of presence.
^b p = conditional probability of detection at a sampling unit, with one
visit.
^c m = number of visits to a sampling unit.
^d power_region = probability of at least one detection in a region.

For the marbled murrelet Zone 2 inland survey in
California, lower bound estimates of P = 3 percent
and p = 10 percent were assumed and a sample size of
174 was necessary to attain a 95 percent power for
each forest habitat stratum (Hunter et al. 1998).


Discussion 


In the marbled murrelet protocol, observers note mur- 
relet activity, recording visible and audible detections 
with behaviors assigned to one of three categories: (1) 
occupancy: present and exhibiting nesting behavior; 
(2) presence: present but not exhibiting nesting behavior;
and (3) absence. Although we have referred solely
to "presence" or "absence" of species, for the murrelet,
"occupancy" may be substituted for "presence"
for the B_d model, if appropriate to the requirements of
a particular survey.


Maximum Likelihood Estimators 


In this chapter, we have focused on the assumptions of
the B_d model and its probability distribution. We have
presented formulas for the calculation of power—for 
sampling units and for target populations in entire 
regions—for presence-absence surveys satisfying the 
assumptions of this model. Such information is useful 
in the determination of sampling design for such 
surveys. 

For the analysis of data collected from presence-ab- 
sence surveys, T. A. Max, J. A. Baldwin, and H. T. 
Schreuder (personal communication) have developed 
closed-form maximum likelihood estimators for P and 
p within a probability parameter space, for marbled 
murrelet and spotted owl survey protocols in the Pacific
Northwest. Their owl estimators assume a protocol
whereby the number of visits to sampling units is
modeled by a negative binomial model: visits to a
sampling unit cease once the behavior (i.e., presence)
has been observed, or a specified maximum
number of visits has been reached. Their murrelet
estimators alternatively use the assumptions of the B_d
model, prescribing a fixed number of visits to each
sampling unit. Their estimators assume fixed P and p
for a region.

More general maximum likelihood estimators need 
to be developed, allowing varying P and p for multiple 
regions, years, and seasons. One approach might use 
computer optimization routines to approximate maxi- 
mum likelihood solutions. This context would be 
analogous to capture-recapture estimators, with
capture-recapture heterogeneity corresponding to varying
B_d regional P and p, and varying recapture and survival
estimators corresponding to varying year and
season P and p.
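
As a rough illustration of what a numerical approach could
look like in the simplest case, the sketch below approximates
maximum likelihood estimates of a single fixed P and p by a
crude grid search. It is not the closed-form estimator cited
above, and it rests on assumptions not stated in the text:
that each sampling unit is visited exactly m times and that
the data record the number of visits with a detection at each
unit. The function names and the grid step are hypothetical.

```python
from math import comb, log


def log_likelihood(P: float, p: float, detections: list[int], m: int) -> float:
    """Log-likelihood of per-unit detection counts under fixed P and p."""
    total = 0.0
    for d in detections:
        if d > 0:
            # Species present and detected on d of the m visits.
            total += log(P * comb(m, d) * p**d * (1.0 - p) ** (m - d))
        else:
            # Either absent, or present and missed on all m visits.
            total += log((1.0 - P) + P * (1.0 - p) ** m)
    return total


def grid_mle(detections: list[int], m: int, step: float = 0.01) -> tuple[float, float]:
    """Grid search over the interior of the (P, p) unit square."""
    grid = [i * step for i in range(1, round(1.0 / step))]
    return max(
        ((P, p) for P in grid for p in grid),
        key=lambda pair: log_likelihood(pair[0], pair[1], detections, m),
    )


if __name__ == "__main__":
    # Made-up data for illustration: 50 units with no detections,
    # 50 units with detections on 4 of 8 visits.
    data = [0] * 50 + [4] * 50
    print(grid_mle(data, m=8))
```

A general-purpose optimizer could replace the grid search,
and separate P and p per region, year, or season would add
parameters to the same likelihood; both are left out here to
keep the sketch short.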


Repeated Visits to Sampling Units—its Effect 
on Power for Regional Surveys 


Presence-absence surveys have historically focused on 
locations where species are present at relatively high 
abundance, to determine behavioral and habitat char- 
acteristics. Protocols for such surveys have empha- 
sized repeated visits to sampling units to ensure a high 
degree of power to detect presence in each sampling 
unit. Without repeated visits, the conditional proba- 
bility of detection at specific locations may be low and 
the probability of not detecting the species unaccept- 
ably high. 

In surveys, however, where the primary objective is 
to sample a rare species to determine whether it is pres- 
ent in a region, a more efficient sampling design may 
be quite different. With this objective, it can 
be shown that the power of the survey will be effec- 
tively increased by sampling additional sampling units 
rather than by repeatedly revisiting sampling units that 
have already been sampled, if conditions are reasonably
approximated by the assumptions of the B_d
model. That is, increasing n is more efficient and cost
effective than increasing m. This observation may be 
surprising at first to surveyors accustomed to existing 
protocols that have emphasized repeated visits to sam- 
pling units. 

The reason for this is that the first visit to a sam- 
pling unit will provide a maximum amount of “infor- 
mation”—more than a second visit. The amount of in- 
formation then decreases with each successive visit. 
Revisiting a sampling unit will indeed increase the 
probability of detection of the species if it is present, 
but moving on to new sampling units will increase the
probability even more for detection of the species in 
the entire region. If a sampling unit has been visited 
once and the species was not detected, a second visit 
to that sampling unit will have probability P(1 — p)p 
of detecting the species. A visit to a new sampling 
unit, however, will have probability Pp of detecting 
the species. Since P(1 — p)p < Pp, it is thus the better 
strategy from a statistical point of view to move on to 
new sampling units rather than to revisit old ones. 

We illustrate this effect with an example. If n = 50
units are sampled in a population with P = 1 percent
(the figures that follow are reproduced by p = 50 percent),
an increase in sampling intensity from m = 4 to 8 visits
to each sampling unit will raise the power from
37.6 to 39.4 percent. However, if the sample size is
raised to n = 100 with m = 4 visits, resulting in an
equal number of total sampling unit-visits, the power
of the survey will be increased to 61.0 percent. Costs
will likely be higher for the latter alternative, to move
to new sampling units, but even if n = 75 sampling
units are surveyed with m = 4 visits, the power is
raised to 50.7 percent. These comparative differences
will remain generally true for other cases although the
contrasts will be less extreme where the levels of P are
higher.
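
The arithmetic behind this example can be reproduced with a
short Python sketch. The value of p is not stated in the
example; p = 50 percent is assumed here because it yields the
quoted percentages. The function name is hypothetical.

```python
def power_region(P: float, p: float, m: int, n: int) -> float:
    """Regional power 1 - (1 - P * p')^n, with p' = 1 - (1 - p)^m."""
    p_prime = 1.0 - (1.0 - p) ** m
    return 1.0 - (1.0 - P * p_prime) ** n


if __name__ == "__main__":
    P = 0.01
    p = 0.50  # assumed; not given in the text
    # (n, m) designs compared in the example.
    for n, m in [(50, 4), (50, 8), (100, 4), (75, 4)]:
        power = 100.0 * power_region(P, p, m, n)
        print(f"n = {n:3d}, m = {m}: {n * m} unit-visits, power = {power:.1f}%")
```

Doubling m from 4 to 8 at n = 50 barely changes p' (from
about 0.94 to nearly 1), whereas doubling n doubles the
exponent in the power formula, which is why the second design
gains far more from the same 400 unit-visits.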


Variation in P and p and Its Effects on Power_region


It is disconcerting that in practical application it may 
not be realistic to make the assumption that P and p 
are fixed, as in the B_d model. How might variation in
the probability of presence P and the conditional
probability of detection p affect the B_d model and its
power? Species such as the fisher and the American 
marten (Zielinski and Stauffer 1996) may very well be 
opportunistic and the probabilities p may increase, or 
decrease, with time due to the capabilities of the 
species to adapt their behavior to visiting baited sign 
detection stations. For murrelets, the effective survey 
area of a morning's visit to a 48.6-hectare sampling 
unit is estimated to be approximately 12.2 hectares 
(30 acres). This reflects an observer's ability to hear 
and see murrelet behavior that often includes circling 
in and around the nest area. Therefore, the sampling 
unit cannot be completely surveyed in a morning's 
visit and must be surveyed with repeated visits spread 
over the April-August nesting season and between
years. It is certainly likely in these cases that P and p
may vary, geographically, seasonally, and annually.

Feller (1968:230-231) proves the surprising result 
that the variability of the probability of presence P in 
the binomial model actually decreases the variance of 
its estimator. For the conditional probability of detec- 
tion p, it can be shown, with some elementary proba- 
bility calculations for the Bg model, that if p varies in 
a survey at or above a minimal (assumed) fixed value, 
say pr, then the power of the survey will be at least as 
large as that calculated for the fixed py. In fact, if p 
varies symmetrically around a fixed p,, then the power 
of the survey can be shown to be at least as large as 
that calculated for the fixed py. These results suggest 
that the power of the survey will not be reduced by p 
varying above, or symmetrically around, an assumed 
fixed average p; for a survey; in other words, power 
calculations for regional surveys using the Bg model 
are robust to those types of variation in p. 
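
One elementary way to see this is sketched below in Python,
under an assumption the text does not spell out: that p
varies from visit to visit within a sampling unit, with the
visits still independent. The probability of missing the
species on every visit is then the product of the per-visit
miss probabilities, and for a given average that product is
largest when all visits share the same p. The function name
and the example values are hypothetical.

```python
from math import prod


def power_unit_varying(visit_probs: list[float]) -> float:
    """Power at one sampling unit when detection probability differs by visit."""
    return 1.0 - prod(1.0 - p for p in visit_probs)


if __name__ == "__main__":
    fixed = [0.30] * 8                      # constant p over eight visits
    symmetric = [0.20, 0.40] * 4            # varies symmetrically, mean 0.30
    at_or_above = [0.30, 0.35, 0.40, 0.30, 0.45, 0.30, 0.50, 0.30]
    for label, probs in [
        ("fixed", fixed),
        ("symmetric", symmetric),
        ("at or above", at_or_above),
    ]:
        print(f"{label:12s} power_unit = {power_unit_varying(probs):.4f}")
```

Because the regional power increases with the per-unit
detection probability, a per-unit power at least as large as
the fixed-p value carries through to the regional
calculation under this reading.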

Matsumoto (1999) has conducted a sensitivity 
analysis of power estimates for the murrelet protocol, 
applied to regions with low species abundance. Her 
study determined that power estimates are quite ro- 
bust to varying parameter probabilities P and p within 
the investigated ranges. She examined varying P and p, 
assuming low average abundance levels of P = 1, 3, 5, 
and 10 percent, and average levels of conditional 
probability of detection p = 10, 25, and 50 percent. 
Her simulation study examined the effects of varying 
P and p on estimates of power_region, based upon the B_d
model assumptions of fixed P and p. She allowed P
and p in her simulation to vary, using beta distribu- 
tions with mean values equal to the assumed fixed val- 
ues and with varying standard deviations. Her study 
indicated that power estimates are quite robust to 
varying parameter probabilities for P and p within 
those ranges and beta distributed around assumed 
fixed averages. 


Biological and Sampling Components Affecting 
Presence and Detectability 


Errors of commission (species predicted but not observed)
in wildlife habitat relationship modeling, both
real and apparent, affect the predictive accuracy of
wildlife habitat relationship models. We have focused
on the statistical aspects of power and sample size
selection for presence-absence surveys for a species
characterized by an uncertainty of detection. A number of
biological components affect both detectability p and 
presence P. Real errors are caused by species-specific 
behavior, such as avoidance of humans, cryptic nature, 
episodic appearance, or temporal and spatial variation. 
Such behavior occurring globally throughout the sur- 
vey region affects P, the probability of presence of the 
species. Apparent errors, on the other hand, are caused 
by similar behavior occurring dynamically within sam- 
pling units. This affects p, the conditional probability 
of detection of the species, if present. Other influences 
on apparent error, such as species density, sampling ef- 
fort, habitat structure, visibility, observer bias, number 
of observers, ambient noise, season, and weather affect 
the detectability p. It has been beyond the scope of this 
study to investigate the contribution of each of these 
biological and sampling components to P and p. Future 
investigators are well advised to examine the relative 
effects of each of these contributors to detectability in 
their species surveys. 


Conclusions 


By incorporating uncertainty of detection into survey 
design and analysis, the predictive accuracy of wildlife 
habitat relationship models can be improved. The adjusted
binomial B_d model provides a method for incorporating
uncertainty of detection into presence-absence
surveys. The B_d model is useful for both the
design and analysis of the survey. For the design, it
allows the calculation of the number of visits necessary
at sampling units to ensure a prescribed power, or 
probability of detection, when the species is present. It 
also allows the calculation of sample sizes and power 
for regional surveys. For the analysis, it provides a 
model for estimating the parameters P, the probability 
of presence, and p, the conditional probability of de- 
tection if the species is present, based upon presence- 
absence data from a survey, using maximum likeli- 
hood. Moreover, although its application has been 
illustrated here for a particularly challenging species, 
the marbled murrelet, it is sufficiently general to be 
applicable to presence-absence surveys of other species 
in sampling units or regions, wherever detectability is 
of concern. 


Accuracy of Bird Range Maps Based on
Habitat Maps and Habitat Relationship Models 


Barrett A. Garrison and Thomas Lupo 


Maps illustrating the range or distribution of an
organism are fundamental sources of natural 
history information used in many biological conserva- 
tion efforts (see Price et al. 1995). In addition, large- 
area wildlife conservation planning efforts such as gap 
analysis (Scott et al. 1993; Davis et al. 1998) use range 
maps as one source of information to predict species 
occurrences and assess biodiversity. Range maps are 
developed many ways (Csuti 1996), including the tra- 
ditional method using manually delineated ranges that 
often result in a few cohesive polygons that represent 
general ranges on relatively small-scale maps (e.g., 
1:5,000,000). Recently, range maps are being devel- 
oped by linking habitat maps with habitat relationship 
models in a geographic information system (GIS). 
Data on species occurrences (e.g., museum records, 
checklists, etc.), topography, and climate also can be 
linked in the GIS as additional data for the maps or to 
test map accuracy. Habitat maps and habitat relation- 
ships models are becoming increasingly available, so 
model-based maps are becoming the standard method 
for mapping species' ranges. 

Model-based maps have several advantages over 
manually drawn maps including (1) automation, be- 
cause maps can be produced by computers; (2) in- 
creased precision, because model-based ranges can be 
more complicated and more biologically based; (3) 
consistency, because habitat relationship models are 


linked with habitat maps so maps of different species 
are based on the same types of information; and 
(4) flexibility, because different models, including 
mechanistic types (Maurer, Chapter 9), and different 
data can be used to develop several maps that are 
verifiable. 

Increased automation, precision, consistency, and 
flexibility, however, may not necessarily result in 
more-accurate range maps because model-based map 
accuracy is highly dependent on the model and data 
used to develop and test maps. Map accuracy must be 
known if conservation efforts are to use these maps, 
because fiscal and logistic resources could be misused 
and incorrect conservation actions could be taken 
based on inaccurate maps. Species distribution and 
habitat maps combined with habitat relationship 
models can predict species composition used to iden- 
tify potential conservation areas (Scott et al. 1993). 
Overpredicting range (commission error) may lead to 
misapplication of species-focused conservation efforts, 
because predicted species may be absent and habitats 
and locations may be overvalued, if species presence is 
a criterion for acquisition or management. However, 
underpredicting range (omission error) also may mis- 
apply conservation efforts due to the species being 
present when predicted absent. Habitats and locations 
may be undervalued with omission errors as predicted 
species richness will be lower than actually occurs. 

Range map error may be due to errors with habitat 
polygons and other spatial data on which ranges are 
based (e.g., topography, climate) and/or with habitat 
relationships models and locational data (Edwards et 
al. 1996; Krohn 1996; Karl et al., Chapter 51). Fur- 
thermore, differential accuracy and error patterns may 
occur due to species-specific ecological attributes, in- 
cluding aggregation patterns, home-range size, niche 
width, range size, and population abundance, trend, 
or stability (Krohn 1996; Boone 1997; Hepinstall et 
al., Chapter 53; Karl et al., Chapter 51). Accuracy is 
affected by test data that are limited to certain species, 
locations, time periods, or habitats. 

Accuracy of model-based species’ range maps for 
large numbers of species and large geographic areas is 
rarely determined. Tests of model-based maps usually 
have focused on single species (Hollander et al. 1994) 
or on many species from several small areas (Edwards 
et al. 1996; Krohn 1996; Boone 1997). Tests generally 
are not performed to determine how species-specific 
ecological attributes affect map accuracy. Using 
species checklists from relatively small-sized conserva- 
tion areas (e.g., wildlife refuges, national parks), 
Krohn (1996) and Boone (1997) found that map accu- 
racy was affected by some ecological attributes. Be- 
cause large-area conservation efforts are using model- 
based maps, testing should be done over large areas 
(i.e., states and provinces) and should involve many 
species. In this study, we evaluated how ecological at- 
tributes affected accuracy of range maps for one hun- 
dred species of birds that breed in California. In this 
chapter, we attempt to determine whether differential 
error patterns exist and what their causes might be so 
that these patterns can be considered when model- 
based maps are used for wildlife conservation. 


Methods 


We developed range maps for one hundred species of 
breeding birds by randomly selecting them from the 
184 species with population trends reported for Cali- 
fornia from the Breeding Bird Survey (BBS) for 
1966-1996 by the Biological Resources Division of 
the U.S. Geological Survey (Sauer et al. 1997). Range 
refers to the mapped area in which the species is pre- 
dicted to occur when breeding. Our maps functionally 
defined the extent of each species’ breeding distribu- 


tion in California using the definition of Morrison and 
Hall (Chapter 2). As of 1994, three hundred and 
twenty-five species of birds were known to breed in 
California (Small 1994), so BBS data were avail- 
able for 57 percent of the state’s breeding birds. BBS 
methods and data biases are described by Sauer and 
Droege (1990) and Price et al. (1995). 

The habitat map developed for California’s Gap 
Analysis project (Davis et al. 1998) was the basis for 
each species’ range map. Habitat polygons were iden- 
tified using the wildlife habitat classification system 
of the California Wildlife Habitat Relationships 
(CWHR) system (Mayer and Laudenslayer 1988) such 
that “habitat” in this context is a distinct association 
of dominant plant species meeting CWHR classifica- 
tion criteria. This definition of “habitat” differs from 
that of Morrison and Hall (Chapter 2) because we 
worked with mapped and classified polygons. Poly- 
gons had wetland attributes, which we could not use 
because these data were lacking for 35 percent of the 
polygons and for large areas of the state (B. Garrison 
and T. Lupo unpublished data). The habitat map had 
a minimum mapping unit of 100 hectares, and the av- 
erage habitat polygon was 1,930 hectares (Davis et al. 
1998). In a GIS, the habitat map was combined with 
suitability values for breeding habitats predicted by 
CWHR habitat relationships models (Garrison and 
Sernka 1997). Habitats were suitable for breeding if 
the suitability value was rated by CWHR as Low, 
Medium, or High (see Garrison and Sernka 1997 for 
definitions of these ratings). The map was further re- 
fined by retaining habitat polygons that occurred in 
counties where the species was known to breed based 
on county bird checklists provided by the California 
Bird Records Committee (R. Erickson and M. Patten 
unpublished data). Polygons were not “clipped” to 
county boundaries, so there was occasional overlap 
into counties where the species was not known to 
occur. 

Accuracy Assessment 


Six life-history attributes and three measures of popu- 
lation dynamics (hereafter called ecological attributes) 
were used as independent data to determine if they 
were responsible for possible error patterns in five 
measures of map accuracy. Our analysis followed the 
general approach discussed by Krohn (1996) and 
Boone (1997) and further refined by Hepinstall et al. 
(Chapter 53). Size (square kilometers) of the species' 
breeding range delineated using the habitat map, habi- 
tat relationship models, and county checklists was 
used as the independent variable for range extent. 
Number of habitats modeled as breeding habitat for 
each species was used as the independent variable for 
niche width. Primary habitat association (terrestrial or 
aquatic) was determined using the major habitat use 
pattern described by Zeiner et al. (1990). Species’ sea- 
sonality (summer or yearlong) was categorized using 
Small (1994). We also determined whether vocaliza- 
tions (songbird) (Ehrlich et al. 1988; Zeiner et al. 
1990) were the primary method of detecting breeding 
individuals (yes or no) on BBS routes. Population ag- 
gregation pattern (territorial or colonial) was catego- 
rized using Ehrlich et al. (1988) and Zeiner et al. 
(1990), and relative abundance, population trend and 
trend P-values for BBS data from California from 
1966 to 1996 (Sauer et al. 1997) were the three meas- 
ures of population dynamics. The number of BBS 
routes per species from which population trends were 
calculated averaged 69.2 ± 46.9 (mean ± standard 
deviation; range 14-178). 

BBS records from 1977-1996 were used to test map 
accuracy. Start locations of individual 40-kilometer 
BBS routes were point locations for the presence or ab- 
sence of each species. A species was considered present on 
a route if detected in at least two years during the twenty- 
year sample period; otherwise, the species was considered absent. 
We chose detections from at least two years per BBS 
route as the minimum because we felt that a detection 
in only one year out of twenty was too infrequent to 
represent the species' breeding range and might carry an 
unreasonable likelihood of misidentification. 
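
The presence rule above can be sketched as a small function. The data layout (a mapping from survey year to the count of individuals detected on a route) and the function name are assumptions made for illustration only.

```python
def route_presence(detections_by_year, min_years=2):
    """Return True if a species counts as present on a route.

    detections_by_year: hypothetical mapping of survey year -> number of
    individuals detected on the route in that year.
    min_years: minimum number of years with at least one detection
    (two in the rule described in the text).
    """
    years_detected = sum(1 for count in detections_by_year.values() if count > 0)
    return years_detected >= min_years
```

A route with detections in 1980 and 1991 would be scored as present, whereas a route with a detection in a single year would be scored as absent.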

We used 1:100,000-scale quadrangle maps for Cal- 
ifornia (n = 99) as the test grid to calculate map accu- 
racy. A standard 2x2 error matrix (Congalton 1991) 
was calculated for each species by determining agree- 
ment between quadrangles where the species was pres- 
ent or absent by the range map and present or absent 
from BBS data. The 1:100,000-scale quadrangle maps 
were appropriate for our accuracy assessment given 
the large size of habitat polygons and species' ranges, 
occurrence of the species anywhere along the 40- 


kilometer route, and need to determine error patterns 
for an area as large as California. Larger-scale quad- 
rangle maps (1:24,000, 1:62,500, etc.) have greater 
resolution (Stoms 1992) but bird-occurrence data and 
GIS coverages were not congruent with these scales. 
Furthermore, we were interested in evaluating how 
range map accuracy was affected by species’ ecologi- 
cal attributes, not in measuring absolute error rates. 
The nine ecological attributes are not affected by map 
scale so their effects could be tested with grids of vari- 
ous scales. It would be appropriate, however, to 
reevaluate our results should data (e.g., each stop 
along the 40-kilometer route) become available that 
are more appropriate for analysis using larger-scale 
quadrangle maps or individual habitat polygons. 


Statistical Analysis 


Using the 2x2 error matrix for each species, we calcu- 
lated levels of presence (present by map and BBS), ab- 
sence (absent by map and BBS), and total (presence 
plus absence accuracy) accuracies. We also calculated 
commission (present by map but absent by BBS) and 
omission (absent by map but present by BBS) errors. 
Using backward stepwise multiple regression, the 
three accuracy and two error measures (hereafter 
called accuracy measures) were dependent variables, 
while the nine ecological attributes were independent 
variables. Because of deviations from normality, arc- 
sine (radian degrees) transformations were applied to 
proportion values of the accuracy measures and BBS 
trend P-values. Square-root transformations were ap- 
plied to range size, number of breeding habitats, and 
BBS trend, and log10 transformations were applied to 
BBS abundances (Zar 1996). 
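
The five measures defined above can be computed from paired presence/absence values per quadrangle, as in the following minimal sketch. The function names and the list-of-booleans input are hypothetical. The arcsine helper assumes the common arcsine square-root form for proportions; the text states only that an arcsine transformation was applied.

```python
import math


def error_matrix_measures(map_present, bbs_present):
    """Compute accuracy and error proportions from a 2 x 2 error matrix.

    map_present, bbs_present: equal-length sequences of booleans, one pair
    per quadrangle (True = species present by that source).
    """
    if len(map_present) != len(bbs_present) or len(map_present) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(map_present)
    pairs = list(zip(map_present, bbs_present))
    both = sum(1 for m, b in pairs if m and b)             # present by map and BBS
    neither = sum(1 for m, b in pairs if not m and not b)  # absent by map and BBS
    commission = sum(1 for m, b in pairs if m and not b)   # map present, BBS absent
    omission = sum(1 for m, b in pairs if not m and b)     # map absent, BBS present
    return {
        "presence": both / n,
        "absence": neither / n,
        "total": (both + neither) / n,
        "commission": commission / n,
        "omission": omission / n,
    }


def arcsine_transform(proportion):
    """Arcsine square-root transform in radians (assumed form)."""
    return math.asin(math.sqrt(proportion))
```

Because every quadrangle falls in exactly one cell of the matrix, total accuracy, commission error, and omission error sum to 1.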

Backward stepwise multiple regression was con- 
ducted using a general linear models procedure (SPSS 
1998). Habitat association, aggregation, songbird, 
and seasonality were categorical variables, while range 
size, number of breeding habitats, BBS trend, BBS rel- 
ative abundance, and BBS trend P-values were contin- 
uous variables. All nine independent variables were 
initially tested singly with the five accuracy measures, 
and independent variables were then removed from or 
re-entered into the model according to their individual 
F-test results (removal at P > 0.05) until all remaining 
variables had P < 0.05. 
Interactions were not tested because of the large 
number of two-way interactions (n = 36) in the initial 
model, the small number (n = 1-3) of independent vari- 
ables remaining after the stepwise regression, and 
low levels of multicollinearity between independent 
variables. 

TABLE 30.1. 

Values of nine ecological attributes used to test accuracy of breeding range maps for one hundred species of birds in California (see text for attribute definitions). 

Measure                    Mean       Median     Std. dev.      Min.           Max.
Continuous variables
  Range size (km²)         164,348    151,684    [illegible]    2,446          404,123
  No. breeding habitats    22.6       22.0       12.3           4.0            52.0
  BBS trend(a)             3.6        0.9        9.9            -10.2          31.6
  BBS trend P-values       0.3        0.2        0.3            0.0            [illegible]
  BBS relative abundance   5.92       2.15       [illegible]    [illegible]    89.2

Categorical variables
  Habitat association      Terrestrial: 80 spp., Aquatic: 20 spp.
  Seasonality              Summer: 24 spp., Yearlong: 76 spp.
  Songbird                 Yes: 68 spp., No: 32 spp.
  Aggregation              Territorial: 87 spp., Colonial: 13 spp.

(a) Percent change in Breeding Bird Survey population index between 1966 and 1996. 
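
The backward elimination idea described above can be sketched in a few lines. This is an illustrative stand-in, not the SPSS general linear models procedure used in the study: it drops the predictor with the smallest partial F statistic until every remaining predictor meets a user-supplied critical value, rather than working from P-values. All function names, the critical value, and the toy data are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def residual_ss(columns, y):
    """Residual sum of squares for least squares of y on an intercept plus columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[r] * row[c] for row in design) for c in range(k)] for r in range(k)]
    xty = [sum(row[r] * y[i] for i, row in enumerate(design)) for r in range(k)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def backward_eliminate(predictors, y, f_crit):
    """Return the predictor names retained after backward elimination.

    predictors: dict of name -> list of values; f_crit: critical partial F.
    """
    kept = list(predictors)
    while kept:
        rss_full = residual_ss([predictors[k] for k in kept], y)
        df_resid = len(y) - len(kept) - 1
        f_stats = {}
        for name in kept:
            reduced = [predictors[k] for k in kept if k != name]
            f_stats[name] = (residual_ss(reduced, y) - rss_full) / (rss_full / df_resid)
        weakest = min(f_stats, key=f_stats.get)
        if f_stats[weakest] >= f_crit:
            break  # every remaining predictor meets the criterion
        kept.remove(weakest)
    return kept


# Hypothetical toy data: y depends strongly on x1; x2 is unrelated to y.
x1 = [i / 10 for i in range(40)]
x2 = [float((7 * i) % 11) for i in range(40)]
y = [2.0 + 3.0 * a + 0.5 * ((i * 37) % 13 - 6) / 6 for i, a in enumerate(x1)]
kept = backward_eliminate({"x1": x1, "x2": x2}, y, f_crit=4.0)
```

In practice the critical value would be taken from the F distribution for the chosen significance level and residual degrees of freedom.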


Results 


Most species were territorial songbirds that were year- 
long residents of terrestrial habitats (Table 30.1). 
Breeding range size averaged 164,348 square kilome- 
ters (median = 151,684 square kilometers); the marsh 
wren (Cistothorus palustris) and golden eagle (Aquila 
chrysaetos) had the minimum and maximum ranges, 
respectively. Birds were modeled to breed in an aver- 
age of twenty-three habitats (median = 22), and the 
American white pelican (Pelecanus erythrorhynchos) 
and western gull (Larus occidentalis) bred in the mini- 
mum number of habitats, while the American kestrel 
(Falco sparverius) bred in the maximum number of 
habitats. BBS route indices averaged 3.6 percent 
change (median = 0.9 percent), and most route indices 
were not statistically significant (Table 30.1). 

Mean total accuracy was 65.7 percent (median = 
67.7 percent), and the double-crested cormorant (Pha- 
lacrocorax auritus) and brown-headed cowbird 
(Molothrus ater) had the minimum and maximum 
total accuracy levels, respectively. Presence and ab- 


sence accuracy averaged 49.1 percent (median = 47.5 
percent) and 16.6 percent (median = 10.6 percent), re- 
spectively (Table 30.2). The western gull and brown- 
headed cowbird had the minimum and maximum val- 
ues, respectively, for presence accuracy. Eight species 
and the black-billed magpie (Pica hudsonia) had the 
minimum and maximum values, respectively, for ab- 
sence accuracy. Commission error was the greatest 
source of error, averaging 33.3 percent (median = 29.8 
percent), while omission error was the lowest source 
of error, averaging 1.0 percent (median = 0.0 percent) 
(Table 30.2). The brown-headed cowbird and snowy 
egret (Egretta thula) had the minimum and maximum 
commission errors, respectively. The marsh wren had 
the greatest omission error, while fifty-three of the one 
hundred species had no omission errors. Detections 
averaged 48.6 of the ninety-nine 1:100,000 grid cells 
(median = 47.0, sd = 22.8, minimum = 11, maximum 
= 91; n = 100). 

Regression equations with the best fit explained 
18-63 percent (R² = 0.175-0.628) of the variance in 
the five accuracy measures (Table 30.3). Presence and 
absence accuracies had 52-63 percent of the variance 
explained by the regressions, while total accuracy and 
commission error had 42-44 percent of the variance 
explained. Omission error was poorly explained by 
the ecological attributes, as only 18 percent of the variance 
was explained, and only by range size. 
TABLE 30.2. 

Values of five accuracy measures from testing breeding range maps for one hundred species of birds in California against Breeding Bird Survey (BBS) detections. 

                                                                         95% conf. interval
Measure          Mean    Median    Std. dev.      Min.           Max.           Lower    Upper
Accuracy (%)
  Total          65.7    67.7      16.4           26.3           95.0           62.4     68.9
  Presence       49.1    47.5      [illegible]    [illegible]    91.9           44.5     53.7
  Absence        16.6    10.6      [illegible]    1.0            [illegible]    13.2     20.0
Error (%)
  Commission     33.3    29.8      16.6           [illegible]    [illegible]    30.0     36.6
  Omission       1.0     0.0       [illegible]    0.0            [illegible]    0.7      1.3


One to three of the nine ecological attributes were 
retained (P < 0.02) by the stepwise regressions. BBS 
abundance was part of the regressions for all accuracy 
measures but omission error. Total and presence accu- 
racies increased with increasing abundance, while ab- 
sence accuracy and commission error decreased with 
increasing abundance (Fig. 30.1). Presence accuracy 
increased and absence accuracy and omission error 
decreased with increasing range size (Fig. 30.2). Sea- 
sonality, number of breeding habitats, whether the 
species was a songbird or not, BBS trend, and P-value 
of the BBS trend were not retained (P > 0.06) in the 
general linear models for any accuracy measure (Table 
30.3). 

Range size and BBS abundance each had the great- 
est standardized coefficient for two of the four meas- 
ures with more than one attribute in the equation 
(range size was the only attribute for omission error), 
indicating they played the greatest role in determining 
accuracy (Table 30.3). The standardized coefficient for 
range size was greater than the coefficient for BBS 
abundance in the two equations with both attributes, 
indicating that range size played the greatest role in 
determining accuracy for those measures (presence 
and absence accuracy). Habitat association and aggre- 
gation were part of the regression equations for three 
accuracy measures. Total and presence accuracy were 
greater and commission error was lower if the bird 
was a territorial species, and total accuracy was 
greater and absence accuracy and commission error 
were lower if the species was associated with terrestrial 
habitats (Table 30.3). 


Discussion 


Model-based range maps were most accurate for 
breeding birds with the following attributes: (1) were 
relatively abundant; (2) had relatively large breeding 
ranges; (3) were territorial; and (4) were associated 
with terrestrial habitats. Species that were less abun- 
dant, were colonial, associated with aquatic habitats, 
and had relatively small ranges had model-based maps 
that were comparatively inaccurate. Colonial species 
associated with aquatic habitats had particularly high 
rates of commission and omission errors. 

Of the nine ecological attributes, BBS abundance 
and range size were the most important variables ex- 
plaining model-based range map accuracy when com- 
pared to BBS occurrences. Habitat association and ag- 
gregation pattern were the next-most important 
variables. Moderate to high levels of variance (42-63 
percent) explained by most regression models indi- 
cated that few ecological attributes are needed to ex- 
plain map accuracy. Low variance for omission error 
(18 percent) explained by the regression model is 
largely due to the narrow range of omission error 
values. 

Prediction error is related to species life history 
(Flather and King 1992; Krohn 1996; Boone 1997; 
Hepinstall et al., Chapter 53; Karl et al., Chapter 51). 




TABLE 30.3. 

Results of backward stepwise multiple linear regression of effects of nine ecological attributes on model-based 
breeding range map accuracy for one hundred species of birds in California. 

Variables                R²       Steps    Coefficient    SE       Std. coef.    Tolerance      df       F        P-value
Total accuracy           0.442    6
  BBS abundance                            0.247          0.047    0.414         0.939          1, 96    27.67    0.000
  Habitat association                      -0.088         0.023    -0.319        0.838          1, 96    14.65    0.000
  Aggregation                              -0.078         0.027    -0.239        0.886          1, 96    8.68     0.004
Presence accuracy        0.628    5
  Range size                               0.000          0.000    0.575         0.900          1, 96    76.98    0.000
  BBS abundance                            0.309          0.048    0.405         0.985          1, 96    41.71    0.000
  Aggregation                              -0.068         0.027    -0.161        0.912          1, 96    6.11     0.015
Absence accuracy         0.522    6
  Range size                               0.000          0.000    -0.791        0.780          1, 96    98.06    0.000
  Habitat association                      -0.107         0.019    -0.469        0.747          1, 96    33.00    0.000
  BBS abundance                            -0.088         0.036    -0.178        0.946          1, 96    6.03     0.016
Commission error         0.423    6
  BBS abundance                            -0.181         0.040    -0.366        0.939          1, 96    20.88    0.000
  Habitat association                      0.073          0.019    0.321         [illegible]    1, 96    14.31    0.001
  Aggregation                              0.074          0.022    0.273         0.886          1, 96    11.00    0.001
Omission error           0.175    8
  Range size                               0.000          0.000    -0.418        1.000          1, 98    20.80    0.000


Prediction error is also influenced by model complex- 
ity and scale and data resolution (Karl et al. 2000). 
Species’ life history also influences responses to land- 
scape patterns (Hansen and Urban 1992). Boone 
(1997) categorized species by their likelihood of oc- 
curring on range maps, and he found that range size, 
abundance, and number of habitats used were the 
most important of ten variables explaining species oc- 
currence. We found that abundance, range size, aggre- 
gation, and habitat association were the most impor- 
tant variables explaining map accuracy. We conclude 
that some population and habitat-use attributes have 
little effect on map accuracy because seasonality, pop- 
ulation trend, and habitat niche were not important in 
the regressions. Differences between the two studies 
could be due to the larger ranges of birds in Califor- 
nia, more habitats in California, different bird species 
and sample sizes, test data differences, and inclusion 
of different variables in the regressions. 

Birds that are abundant, have large breeding 
ranges, associate with terrestrial habitats, and are ter- 
ritorial are well suited for automated model-based 
range map development because they are usually rela- 
tively well known and widely distributed. In addition, 
habitat maps with large polygons generally portray 
terrestrial habitats more accurately than they do 
aquatic habitats. Furthermore, territorial birds usually 
vocalize, so they are more easily detected with BBS 
than birds that are colonial or do not vocalize. Terri- 
torial individuals are also more evenly distributed 
along BBS routes than are breeding colonies, which 
tend to occur at lower frequencies and are more local- 
ized. Moreover, the strong effect of BBS abundance on 
accuracy indicates that map accuracy may be influ- 
enced by the test data because abundant species are 
more effectively detected and counted with BBS meth- 
ods (see Sauer and Droege 1990). 

Figure 30.1. Scatterplots of BBS abundance (log10 transformed) with four measures of breeding range map accuracy (arcsine 
transformed) for 100 species of birds in California. Lines represent linear regressions (P < 0.02) and 95% confidence ellipses. See 
Table 30.3 for regression equations. 

Habitat maps and habitat relationships models gen- 
erally have more types of terrestrial habitats than 
aquatic habitats (see Mayer and Laudenslayer 1988), 
so terrestrial species have more opportunities to have 
suitable habitat polygons placed in their ranges. Our 
results compare well with Boone (1997), who found 
that range map accuracy increased when compared to 
occurrence data from larger areas (i.e., larger poly- 
gons), and accuracy was lower for species that are 


rare, colonial, difficult to observe, and have narrow 
niches and large body sizes. 

Model-based maps overpredicted species’ ranges 
(commission error) by approximately one-third, while 
ranges had very little underprediction because omis- 
sion error averaged 1 percent and over half the species 
had no omission errors. Accuracy may have changed 
had we used smaller grids from larger-scale maps since 
overall map error declines with smaller grids (Boone 
1997). Increasing the size of accuracy assessment 
areas decreases commission error and does not affect 
omission error (Garrison et al. 1999). The 1:100,000 
grids are large assessment areas (average = 4,936 
square kilometers, n = 99) so commission error in our 
study is likely due to models, BBS test data, and 
species life history attributes. 

Interaction occurs among model accuracy, species 
abundance, and sample sizes of test data (Karl et al., 
Chapter 51). Commission error, in particular, de- 
creases as sample size increases (Karl et al., Chapter 
51), so our relatively high level of commission error 
(33 percent) may be due to the small number of 
species detections. Our average total error of 34 per- 
cent (range 5-74 percent) was slightly greater than 
that of Boone (1997), who found disagreement (both 
commission and omission error) of <1-38 percent be- 
tween forty-seven bird-range maps generated from 
breeding atlases and other published information and 
maps generated using BBS data. 

Figure 30.2. Scatterplots of range size (square root trans- 
formed) with three measures of breeding range map accuracy 
(presence accuracy, omission error, and absence accuracy; arcsine 
transformed) for 100 species of birds in California. 
Lines represent linear regressions (P < 0.001) and 95% confi- 
dence ellipses. See Table 30.3 for regression equations. 

Based on our results, wildlife managers should be 
careful when using automated model-based range 
maps for conservation. Errors of commission and 
omission increased or decreased depending on several 
ecological attributes, including range size, aggregation, 
vocalization, and habitat association. Higher commis- 
sion error for species with smaller ranges may overem- 
phasize conservation, but species with smaller ranges 
tend to be more rare (Meffe and Carroll 1994) so over- 
prediction errors may be acceptable to wildlife man- 
agers. Species that were colonial and associated with 
aquatic habitats had more commission and omission 
errors, respectively, so conservation efforts may 
overemphasize or underemphasize, respectively, these 
species if model-based range maps are used. We sus- 
pect total and presence accuracy and commission error 
would have been greater for aquatic species had there 
been wetland attributes for all polygons. California’s 
Gap Analysis Project habitat map has large polygons, 
many terrestrial habitats, and few aquatic habitats, so 
the map's scale is more appropriate for habitats domi- 
nated by trees or shrubs (Davis et al. 1998). 

Wildlife managers can rectify model-based map er- 
rors that we identified in several ways. Focused map- 
ping efforts could be done for aquatic habitats to 
more accurately portray the areal extent and distribu- 
tion of these important wildlife habitats. Habitat 
maps could be used from smaller areas because they 
would likely have smaller mapping units, and aquatic 
habitats might be mapped at a scale commensurate 
with terrestrial habitats. 

Data sources other than BBS might be more appro- 
priate for testing distribution maps for more rare 
species and colonial species. Species lists from conser- 
vation areas may be more appropriate than BBS data 
for developing and testing maps because these lists are 
more thorough and less subjective since most if not all 
species present in an area are typically documented 
(Edwards et al. 1996; Krohn 1996; Boone 1997). 
However, conservation areas may not be distributed 
throughout a species’ range, so parts of the range may 
not be tested. 

BBS data are limited to breeding birds that occur in 
relatively high numbers and can be readily detected 
using point counts (Sauer and Droege 1990; Price et 
al. 1995). BBS routes are systematically located 
throughout the United States along roads such that it 
is difficult to encounter all species. Yet, BBS routes 


were more appropriate for our study because they 
were evenly distributed throughout California com- 
pared to conservation areas, which were primarily lo- 
cated in wetland, coastal, and alpine areas. Moreover, 
BBS routes encompass larger areas than many conser- 
vation areas, and hence BBS data are more spatially 
consistent with California's Gap Analysis Project habi- 
tat map. In states and provinces, sample sizes of BBS 
routes are generally smaller than those needed for precise 
estimates of prediction accuracy (Karl et al., Chapter 
51). BBS stop data, while not available for our work, 
would have given us much-greater sample sizes. BBS 
data are limited to breeding species, so maps for non- 
breeding species need different data. Lastly, conserva- 
tion efforts should only involve species with ecological 
attributes that are appropriate for habitat relationship 
models, habitat maps, and locational and test data 
used by wildlife managers. Species with ecological at- 
tributes that are not appropriate should not be in- 
cluded in the effort or different models, maps, and 
data should be used. 
A Monte Carlo Experiment for 
Species Mapping Problems 


Daniel W. McKenney, Lisa A. Venier, Aidan Heerdegen, 
and Mick A. McCarthy 


Much ecological research attempts to describe 
species distributions and abundance in ways 
that can be mapped. Managers need maps or spatial 
predictions to make decisions about activities ranging 
from timber harvest scheduling to nature reserve de- 
sign (Hof 1993). But we will never be able to know 
the entire truth about distributions and abundance be- 
cause we cannot sample everywhere. This is particu- 
larly true in places like Canada and Australia where 
there remains much unstudied territory and relatively 
few biologists. Hence, some form of modeling is re- 
quired to produce inventories or maps. Some re- 
searchers have used interpolation techniques such as 
kriging to describe the patterns of interest (e.g., 
McKenney et al. 1998; Villard and Maurer 1996). 
Others use field observations and statistical relation- 
ships with environmental gradients to make spatial 
predictions (e.g., Venier et al. 1998, 1999; Mackey 
1994). In the latter case, relationships between species 
occurrence and biotic and/or abiotic conditions can be 
mapped using spatial estimates of the independent 
variables. 

Generally a successful derivation of a statistical re- 
lationship is related, among other things, to the size of 
the sample and the strength of the relationships be- 
tween the dependent and independent variables. Vari- 
ous randomization procedures such as bootstrapping 
and jackknifing (see for example Manly 1997) have 


been developed to help assess the veracity of relation- 
ships between dependent and independent variables. 
Use of tools such as these is increasing and clearly 
adds to the strength of conclusions drawn from ecolog- 
ical research (Pitt and Kreutzweiser 1998). However, 
applications of these types of tests often focus on pa- 
rameter estimation and the confidence limits around 
these using the actual observation data (e.g., see 
Hilborn and Mangel 1997), not on the resultant 
maps. The robustness of the maps depends on how 
well the observation data sample the environmental 
space in the area being mapped. Our approach is to 
simulate what we will call a “true” map of a species’ 
distribution based on a statistical model. This map is 
compared to predictions from new models based on a 
series of increasing sample sizes. Controlled simula- 
tion experiments such as these are not uncommon 
(e.g., Common and McKenney 1994; Kennedy 1992; 
see also Virtanen et al. 1998). 

In this chapter, we present a Monte Carlo simula- 
tion experiment of the reliability of maps derived from 
logistic regression models as a function of sample size. 
The experiment is set in the context of bird distribu- 
tion models in the Great Lakes Region of North Amer- 
ica where we have been modeling species’ occurrence 
in relation to climate and vegetation gradients (Venier 
et al. 1999). Logistic regression models provide proba- 
bility of occurrence estimates (values between 0 and 1) 
based on a binary (0,1) response variable. Logistic
regression techniques are widely used in ecological research
where presence/absence data are available (Hosmer and
Lemeshow 1989; Collett 1991).


An Overview of the Experimental Design 


The overall design of the experiment was to predefine a 
distribution of probability of occurrence for a particu- 
lar species (in this case magnolia warbler, Dendroica 
magnolia) as a function of environmental variables 
using logistic regression modeling for the Great Lakes 
region of North America. Our selection and presenta- 
tion of this species is purely for illustrative purposes. 
We converted the probability estimates to binary val- 
ues (0,1) and sampled this presence/absence truth using 
a range of sample sizes. For each sample we recreated 
the logistic regression model and predicted probabili- 
ties of occurrence using the same set of environmental 
variables. For each new model, we compared the true 
or baseline probability of occurrence with the pre- 
dicted distribution of probability of occurrence. The 
experimental design allows us to examine differences 
in maps as a function of sample size. To simplify the 
experiment, we forced the new models to use the same 
environmental variables as the original model. 


Study Area and Background 


Our study area includes the Great Lakes Basin of 
Canada and the United States, an area of approxi- 
mately 2.3 million square kilometers divided into 
roughly 100,000 cells for this study. Breeding Bird 
Atlas data from across the region were compiled 
(Niemi et al. 1998) and used to generate the initial 
model. The model used to define the distribution was 
developed using the approach described in Venier et 
al. (1999). Numerous single and multivariate logistic 
regression models were examined that included a 
range of both climatic and vegetation cover parameters.
The probability of occurrence (p) was described
as a function of four variables: two climate
variables, precipitation during the growing season
(PGS) and mean annual precipitation (MAP), and two
land-cover variables, amount of deciduous forest
(ADF) and amount of mixed forest (AMF). The equation
is:


logit(p) = ln[p/(1 - p)] = -0.0273 - 0.0221(PGS) +
0.0119(MAP) + 0.0221(ADF) + 0.0492(AMF).
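
As a minimal sketch of how the equation above turns environmental values into a probability of occurrence, the following Python fragment evaluates the linear predictor and then inverts the logit. The coefficients are those printed above; the function names are illustrative only, and because the units of the four predictors are not stated in the text, no realistic input values are implied.

```python
import math

# Coefficients copied from the equation above. The units of the predictors are
# not stated in the text, so any inputs used with this sketch are illustrative.
INTERCEPT = -0.0273
COEFFICIENTS = {"PGS": -0.0221, "MAP": 0.0119, "ADF": 0.0221, "AMF": 0.0492}


def logit_score(pgs, mean_annual_precip, adf, amf):
    """Linear predictor logit(p) for one grid cell."""
    return (INTERCEPT
            + COEFFICIENTS["PGS"] * pgs
            + COEFFICIENTS["MAP"] * mean_annual_precip
            + COEFFICIENTS["ADF"] * adf
            + COEFFICIENTS["AMF"] * amf)


def probability_of_occurrence(score):
    """Invert logit(p) = ln[p / (1 - p)], written to avoid overflow."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)
```

A score of zero corresponds to a probability of 0.5, and, with the positive AMF coefficient, increasing the amount of mixed forest while holding the other variables fixed raises the predicted probability.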


The spatial climate data for the region were developed
using the ANUSPLIN model of M. Hutchinson
at the Australian National University (e.g., Hutchin- 
son 1987, 1995, 1998). Mathematical surfaces were 
created for 1961-1990 monthly mean maximum and 
minimum temperature and precipitation (McKenney 
et al. 1999). Weather stations (n = 2503) throughout 
the region were used for these surfaces. A number of 
secondary climate variables such as growing-season 
length were derived from these primary surfaces. 
Land-cover variables were derived from AVHRR (Ad- 
vanced Very High Resolution Radiometer) satellite 
land-cover classification available from the United 
States Geological Survey at approximately 1-kilometer 
resolution. The amount of each land-cover type was 
summarized in 5 x 5-kilometer squares to better match
the resolution of the Atlas data. Climate estimates 
were taken from the centers of the squares. 


Monte Carlo Protocol 


The distribution for the magnolia warbler (see Fig. 
31.1 in color section) was the probability of occur- 
rence grid for the region generated from the logistic 
regression model noted above. The probabilities were 
converted to binary values (0,1) for the purposes of lo- 
gistic regression modeling using a uniform random 
number generator between zero and one. The cell was 
assigned a value of one (occupied) if the random value 
was below the probability of occurrence and zero (un- 
occupied) otherwise. For example, if the true proba- 
bility of occurrence were 0.80, then the algorithm 
would generate an occupied cell in 80 percent of the 
cases and an unoccupied cell in 20 percent of the 
cases. 
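
The conversion described above amounts to one Bernoulli draw per grid cell. A minimal sketch in Python follows; the function name, the seed, and the ten-thousand-cell example are assumptions made for illustration and are not taken from the study.

```python
import random


def to_presence_absence(probabilities, rng):
    """Assign 1 (occupied) when a uniform random value falls below the
    cell's probability of occurrence, and 0 (unoccupied) otherwise."""
    return [1 if rng.random() < p else 0 for p in probabilities]


# Illustration: 10,000 hypothetical cells that all have a true probability of 0.80.
rng = random.Random(42)
cells = to_presence_absence([0.80] * 10000, rng)
occupied_fraction = sum(cells) / len(cells)
```

With this many cells the occupied fraction should lie close to 0.80, which mirrors the 80 percent example given in the text.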

The 0,1 grid was then sampled (without replace- 
ment) for a variety of sample sizes (Table 31.1). This 
particular sampling scheme, which shows the pattern 
of results more clearly, was developed after several 
trials, including multiple runs at a single sample size.
For each sample, the logistic regression model was 
developed using the same four variables that were 
used to define the distribution.

Figure 31.2. Mean difference and mean absolute difference between true and estimated probabilities of occurrence at sample sizes ranging from 10 to 10,000.

The logistic regression equation was then resolved for the region using
the gridded estimates of the climate and land-cover 
variables thus producing a gridded probability of oc- 
currence map. This predicted probability of occur- 
rence was compared to the “true” probability in a 
cell-by-cell fashion. Mean absolute difference and 
the mean difference between predicted cell values 
and true cell values were calculated. Other diagnos- 
tics such as mean square errors and standard devia- 
tions of absolute differences were calculated but not 
reported here because they do not substantively add 
to the interpretation of the results. In addition, the 
parameter estimates for the environmental variables 
in each model were stored and examined as a func- 
tion of sample size. We repeated this procedure for 
three other models and compared the results for con- 
sistency. Note our focus was comparing predicted 
probabilities rather than comparing the binary values.
The latter could lead to comparisons of false
positives and negatives and other diagnostics that
typically arise in logistic regression modeling.


TABLE 31.1.

Sampling protocol used in the Monte Carlo experiment.

Sample size      Number of samples   Every N steps
10-59            4                   1
60-99            2                   1
100-199          1                   2
200-499          1                   4
500-999          1                   10
1,000-4,999      1                   25
5,000-10,000     1                   100


TABLE 31.2.

Mean absolute differences between the “true” and estimated
probabilities for sample sizes ranging from 10 to 10,000 for
the magnolia warbler (Dendroica magnolia).

              Magnolia warbler         Species 2   Species 3   Species 4
Sample size   Mean     Min     Max     Mean        Mean        Mean
10            0.250    0.08    0.42    0.094       0.175       0.3210
100           0.060    0.02    0.08    0.044       0.054       0.0675
200           0.042    0.02    0.07    0.025       0.031       0.0525
500           0.020    0.01    0.05    0.014       0.0205      0.0305
1,000         0.014    0.00    0.02    0.011       0.015       [illegible]
10,000        0.007    0.00    0.01    0.002       0.0055      0.0090
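
The protocol above can be sketched in a few lines of Python. This is a deliberately reduced illustration under stated assumptions, not the procedure as implemented for the chapter: it uses a single hypothetical environmental gradient instead of the four variables, invented "true" coefficients (-0.5 and 1.2), 5,000 simulated cells rather than roughly 100,000, and a hand-written Newton-Raphson fit because the text does not say which software was used.

```python
import math
import random


def inv_logit(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(x, y, iterations=25):
    """Fit logit(p) = a + b*x by Newton-Raphson (one predictor only)."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = inv_logit(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # gradient, intercept
            g1 += (yi - p) * xi     # gradient, slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:             # e.g., complete separation in a tiny sample
            break
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def run_experiment(sample_sizes, n_cells=5000, seed=1):
    """Return {sample size: (mean difference, mean absolute difference)}."""
    rng = random.Random(seed)
    x = [rng.uniform(-3.0, 3.0) for _ in range(n_cells)]      # hypothetical gradient
    true_p = [inv_logit(-0.5 + 1.2 * xi) for xi in x]         # invented "true" map
    truth = [1 if rng.random() < p else 0 for p in true_p]    # presence/absence truth
    results = {}
    for n in sample_sizes:
        idx = rng.sample(range(n_cells), n)                   # without replacement
        a, b = fit_logistic([x[i] for i in idx], [truth[i] for i in idx])
        diffs = [inv_logit(a + b * xi) - p for xi, p in zip(x, true_p)]
        results[n] = (sum(diffs) / n_cells,
                      sum(abs(d) for d in diffs) / n_cells)
    return results
```

Calling `run_experiment([10, 100, 1000])` returns, for each sample size, the cell-by-cell mean difference and mean absolute difference between the predicted and "true" probabilities, the same two summaries reported in this chapter.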


Results and Discussion 


The mean of the absolute difference summarizes the 
accuracy of the model over the entire study area (Fig. 
31.2). Each point in the figure represents an individ- 
ual run at the given sample size. The figure shows 
that at the smallest sample sizes (thirty or fewer), individual
models could be very good (mean absolute
differences almost as good as the highest sample) or 
very poor (mean absolute differences greater than 
0.4). Mean absolute differences (the aggregate value 
for all 100,000 cells) range from 0.08 to 0.42 at 
small sample sizes and fall below 0.01 after one 
thousand observations (Table 31.2). In a real-world 
application, only one sample is available out of an 
infinite number of possibilities. This suggests that
models derived from small samples are often very
inaccurate and little confidence should be placed in
maps from such models. Nevertheless, it is difficult
to assess the biological or management significance
of the size of these errors in the absence of a decision-making
context.

Figure 31.3. Intercept and parameter estimates across sample sizes with the true parameter values shown.

The mean difference (Fig. 31.2) provides an indication
of the direction of error in single runs. At small
sample sizes, mean differences were both positive and
negative, ranging between 0.2 and -0.4. Sample sizes
over one hundred have mean differences between 0.1
and -0.1. Beyond one thousand samples, the mean differences
have little variation around zero. The distribution
appears roughly symmetrical; however, we
have not tested for bias.

Parameter estimates from runs are all centered on
the true parameter value (Fig. 31.3). Again, with small
sample sizes there is much variation around the true 
value. When interpreted in the context of the spatial 
predictions (i.e., Fig. 31.2), small perturbations in the 
parameter estimate can lead to large changes in pre- 
dicted probabilities. The implication of parameter esti- 
mate error in an individual run is difficult to assess a 
priori. These results suggest that ecological interpreta- 
tions of individual parameters could be problematic at 
smaller sample sizes because their signs range from 
positive to negative. Clearly, the closer the true value 
is to zero, the more problematic the interpretation of 
the sign even with large sample sizes. 

Last, we ran the same type of experiment for three 
other species (i.e., model configurations). Results were 
similar (Table 31.2). Mean absolute errors declined 
with increasing sample size and generally did not go 
below 1 percent until the sample size was greater than 
one thousand. 


Conclusion 


We created a Monte Carlo experiment to examine the 
effect of sample sizes on the predictive accuracy of 
broadly scaled species distribution models. We note 
that most researchers would undertake validation tests 
of statistical models of species occurrence by with- 
holding actual observation data (e.g., Pearce and Fer- 
rier 2000). If sufficient data are available, then this is 
clearly the most effective way to test the veracity of 
the statistical models; however, even this approach 
does not deal explicitly with the mapping problem per 
se. The mapping problem is connected to the represen- 
tativeness of the observation data and the nature of 
the environmental space over which the spatial predic- 
tions (maps) are being made. 

For this particular experimental configuration, we 
found that map reliability asymptotes around one 
thousand observations. We reported mean errors and 
mean absolute errors and also summary statistics on
the parameter estimates of the models. Maps based on
small sample sizes (less than one hundred) were unsta- 
ble (Karl et al., Chapter 51; Garrison and Lupo, 
Chapter 30; and Elith and Burgman, Chapter 24). In 
fact, parameter estimates varied in their signs even at 
larger sample sizes. Three other models had similar re- 
sults. As with all Monte Carlo experiments, the spe- 
cific results are of course contingent on the formula- 
tion, but we anticipate that they are reasonably 
representative of studies at this scale. 

These results are conservative in that we forced 
new models to use the same set of environmental 
variables as the original model. Future experiments 
will include variable selection problems and explore 
various representations of rarity and nonrandom 
sampling schemes. The ultimate test of reliability 
would be to embed the problem in a decision-making 
framework. Future applications could do so by using 
the spatial predictions in, for example, reserve design 
problems. Given the conservative nature of the ex- 
perimental design, the results would seem to rein- 
force a sense of caution in our ability to map species 
occurrences in the absence of large sample sizes. Ex- 
periments that include other sources of error are 
likely to further decrease our confidence in mapping 
exercises. 

Finally, we note that Monte Carlo experiments are 
valuable tools and could be more widely used for 
species mapping problems. Species mapping applica- 
tions are often derived from small samples and infer- 
ences made across large geographic regions. Monte 
Carlo experiments like these are possibly the only 
approach to gain insights on interactions between 
dependent and independent variables and sample 
sizes.


Measuring Prediction Uncertainty in
Models of Species Distribution 


Jennie L. Pearce, Lisa A. Venier, Simon Ferrier, and
Daniel W. McKenney


The modeling of biological survey data in relation
to mapped environmental variables is often
used as a surrogate for direct species distributional in- 
formation. Such modeling can provide cost-effective, 
definitive, and explicit spatial information for use in 
regional conservation planning and management. To 
date, distributional models have generally been de- 
rived by modeling species presence/absence data col- 
lected at field sites in relation to mapped environmen- 
tal predictors, using logistic regression or related 
modeling techniques (Osborne and Tigar 1992; Buck- 
land and Elston 1993; Augustin et al. 1996; Venier et 
al. 1999). These models are then applied to environ- 
mental layers held within a geographic information 
system (GIS) database to extrapolate predicted likeli- 
hood of occurrence across the entire region of interest. 

To use models of species distributions effectively in 
conservation planning, it is important to determine 
their predictive accuracy (Edwards et al. 1996; Boone 
and Krohn 1999). Predictions from such models will 
always contain a level of error resulting from a wide 
range of factors, including insufficient sample size, 
measurement error in the biological survey data,
measurement error and insufficient spatial resolution 
in the mapped environmental variables, and failure to 
incorporate critical habitat variables and other factors 
(e.g., predation, competition, dispersal) into the modeling.
Evaluation of the nature and magnitude of prediction
error assists in determining the suitability of
models for particular applications and in identifying 
specific weaknesses requiring correction. Such evalua- 
tion also facilitates comparison of modeling tech- 
niques and competing models. 

Performance of distributional models may be as- 
sessed at a number of scales (extent and grain). At the 
broadest scale, maps of predicted distribution may be 
examined visually by experts to determine whether the 
predictions provide a reasonable estimate of the range 
limits of each species. This is an important step if 
maps of species distribution are to be used in decision 
making. It is important for all stakeholders involved 
in, or affected by, a planning decision to have confi- 
dence in the broad-scale accuracy of modeled distribu- 
tions. Visual examination of such maps does not, 
however, tell us much about predictive performance at 
finer spatial resolutions, for example at the scale of a 
grid cell. Nor does it tell us much about the reliability 
of predicted probabilities of occurrence or specific 
sources of error in such predictions. More-detailed 
quantitative assessment of the magnitude, nature, and 
potential sources of prediction error requires statisti- 
cal evaluation of the agreement between predictions 
from a model and observations within an independent 
validation data set (i.e., data collected at sites other 
than those used to develop the model). 

Pearce and Ferrier (2000) present a broad framework
for evaluating the fine-scaled performance of
models using independent data. They describe approaches
for addressing two commonly asked questions
relating to distributional models: (1) does a given
model provide a good rank index of the occurrence of 
a species across a range of sites for a given applica- 
tion, and (2) does the model provide an accurate esti- 
mate of the probability of detecting the species at each 
site for a given application? Although these two ques- 
tions are similar, they require very different methods 
of accuracy assessment and of addressing the needs of 
different applications. The first question relates to the 
ability of a model to rank areas within a region in 
terms of their likelihood of being occupied by the 
species in question. This information is important if a 
model is to be used to design a reserve system, to lo- 
cate potential survey sites, or to explore relationships 
between habitat characteristics and species occur- 
rence. The methods used to address this question do 
not, however, tell us much about how we might cor- 
rect for specific sources of error within a model. The 
second question concerns the accuracy with which a 
model predicts the probability of a given species oc- 
curring at a site, not just in terms of rank order but 
also in terms of absolute value. This second type of as- 
sessment also provides more detailed information 
about specific sources of error in a model, thereby 
providing guidance as to how the performance of a 
model might be improved. 

Our objective in this chapter is to use the evalua- 
tion framework of Pearce and Ferrier (2000) to assess 
the performance of a selection of distributional mod- 
els developed using logistic regression. These models 
relate the distribution of bird species in the Great 
Lakes region to a number of environmental variables. 


The Models 


We evaluated logistic regression models for ten forest 
songbird species (Table 32.1). These species were selected
because there is a large overlap of their ranges
with the Great Lakes region, they use a variety of for- 
est habitat types, and they include common and rare 
species. These models were developed using Breeding 
Bird Atlas data compiled for the Great Lakes Basin as 
part of a project of the Great Lakes Protection Fund 
(NRS 795-2467, Forest Bird Biodiversity: Indicators
of Environmental Condition and Change in the Great
Lakes Watershed). The models relate the presence/ 
absence of breeding birds to broad climate and satel- 
lite-derived land cover using 1,042 selected sites from 
the study area. Models were developed using logistic 
regression. Variables considered for entry into each 
model were chosen to be biologically meaningful to 
each species. Models presented here are those from the 
set of possible models that had the greatest explana- 
tory power within the model development data set 
and that were biologically interpretable. We evaluated 
the predictive performance of these models using data 
from 260 sites withheld for this purpose from the 
model development data set. 


The Evaluation 


As outlined in the introduction, the evaluation frame- 
work of Pearce and Ferrier examines two broad ques- 
tions concerning the predictive accuracy of models. 
These two questions form the basis for the current 
evaluation. 


Does a Model Provide a Good Rank Index of 
Species Occurrence across a Range of Sites? 


The discrimination ability of a logistic regression 
model refers to the capacity of the model to correctly 
discriminate between occupied and unoccupied sites. 
This can be examined graphically by plotting the dis- 
tribution of predicted probabilities associated with the 
occupied sites and the distribution of predicted values 
associated with unoccupied sites. If these two distribu- 
tions are plotted as histograms on the same set of 
axes, then the amount of overlap of the two distribu- 
tions provides an indication of the discrimination abil- 
ity of the model. Discrimination histograms were de- 
rived for each of the ten species models (Fig. 32.1) by 
dividing the predicted probability range (zero to one) 
into ten evenly spaced classes. The histograms for each 
species provide two types of information. First, they 
describe the refinement of the predictions. That is, 
they show the range of predicted values obtained 
within the validation sample. A well-refined model
should generate predictions that span the entire zero-to-one
probability range. Second, the histograms depict
the degree of overlap in predicted probabilities
associated with occupied and unoccupied sites. These
two histograms may best be depicted as two distributions
by plotting the midpoint of each bar and joining
the midpoints by lines. This is the approach taken in
Figure 32.1. If a model has good discrimination ability,
the predicted values for occupied sites will be
higher on average than those for unoccupied sites. In
Figure 32.1, the distribution of predicted values for
occupied sites (shown by the solid line) lies to the right
of that for unoccupied sites (shown by the dotted line)
for all ten models, although there is often a considerable
degree of overlap between the two distributions.


TABLE 32.1.

Logistic regression models describing the distributions of forest songbird species within the Great Lakes Basin.

Common name (Scientific name)
    Model

American redstart (Setophaga ruticilla)
    4.89 - 0.26(X1) - 0.05(X14) - 0.04(X3) + 0.01(X4) - 0.04(X11) - 0.02(X11)
Bay-breasted warbler (Dendroica castanea)
    -3.05 + 0.01(X7) - 0.02(X10) + 0.01(X8) + 0.015(X6)
Black-and-white warbler (Mniotilta varia)
    -41.29 + 3.76(X1) - 0.08(X1)^2 - 0.29(X3) + 0.005(X3)^2 + 0.04(X12) + 0.05(X14)
Blackburnian warbler (Dendroica fusca)
    -1.26 + 0.003(X6) - 0.003(X9) - 0.02(X10) + 0.003(X10)^2
Black-throated blue warbler (Dendroica caerulescens)
    -70.6 + 5.24(X1) - 0.12(X1)^2 + 0.40(X2) - 0.15(X3) + 0.01(X4) + 0.003(X5) + 0.003(X6) - 0.01(X10)
Black-throated green warbler (Dendroica virens)
    5.30 - 0.18(X1) - 0.22(X2) + 0.003(X5) + 0.007(X6) - 0.00002(X6)^2 - 0.016(X10) + 0.00002(X10)
Chestnut-sided warbler (Dendroica pensylvanica)
    -78.51 + 6.05(X1) - 0.12(X1)^2 + 0.01(X4) - 0.06(X10) + 0.05(X12) - 0.09(X13) + 0.16(X14) - 0.01(X14)^2
Magnolia warbler (Dendroica magnolia)
    6.83 - 0.48(X1) - 0.01(X10) + 0.51(X2) - 0.06(X3) - 0.01(X10) [sign illegible] 0.01(X5)
Nashville warbler (Vermivora ruficapilla)
    10.12 - 0.28(X1) - 0.32(X3) + 0.005(X3)^2 + 0.001(X5) + 0.004(X6) + 0.002(X9) - 0.005(X10)
Tennessee warbler (Vermivora peregrina)
    -78 + 6.41(X1) - 0.15(X1)^2 + 0.85(X2) - 0.06(X3)

Note:
X1   Maximum temperature in the hottest quarter
X2   Mean diurnal temperature range
X3   Precipitation seasonality
X4   Precipitation in the hottest quarter
X5   Hardwood (MSS)
X6   Conifer-Hardwood Mix (MSS)
X7   Conifer (MSS)
X8   Bare Ground (MSS)
X9   Grasses or Brush (MSS)
X10  Agriculture (MSS)
X10  Dryland, Cropland, and Pasture (AVHRR)
X11  Woodland/Cropland Mosaic (AVHRR)
X12  Broadleaf Deciduous Forest (AVHRR)
X13  Evergreen Coniferous Forest (AVHRR)
X14  Mixed Forest (AVHRR)

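The discrimination histograms described above can be tabulated with a short routine. The sketch below is illustrative only: the function name is an assumption, and the convention of placing a prediction of exactly 1.0 in the top class is a choice made here, since the text does not say how class boundaries were handled.

```python
def discrimination_histogram(predicted, observed, n_classes=10):
    """Count predictions per evenly spaced probability class, separately
    for occupied (observed == 1) and unoccupied (observed == 0) sites."""
    occupied = [0] * n_classes
    unoccupied = [0] * n_classes
    for p, o in zip(predicted, observed):
        k = min(int(p * n_classes), n_classes - 1)   # p == 1.0 goes in the top class
        if o == 1:
            occupied[k] += 1
        else:
            unoccupied[k] += 1
    return occupied, unoccupied


# Hypothetical validation data, not taken from the study.
occ, unocc = discrimination_histogram([0.05, 0.15, 0.55, 0.95, 1.0],
                                      [0, 0, 1, 1, 1])
```

Plotting the two lists against the class midpoints (0.05, 0.15, and so on) and joining the points by lines gives the form of display used in Figure 32.1.
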
The discrimination ability of logistic regression 
models is often quantified by calculating statistics 
from a 2x2 classification table of predictions and ob- 
servations (e.g., Edwards et al. 1996; Boone and 
Krohn 1999; Hepinstall et al., Chapter 53; Karl et al., 
Chapter 51; Schaefer and Krohn, Chapter 36). A 
species is predicted to be present or absent at a site
based on whether the predicted probability for the site
is higher or lower than a specified threshold probabil- 
ity. A problem with this measure of discrimination ac- 
curacy is that the measure is sensitive to the location 
of the threshold probability. Different threshold values 
will give very different assessments of model accuracy. 
The choice of an appropriate threshold is difficult, and 
often arbitrary, although there are some guidelines de- 
pending on the intended use of the model (Fielding 
and Bell 1997). 
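
The sensitivity of this approach to the chosen threshold is easy to demonstrate. The following sketch builds the 2x2 classification table for a small set of hypothetical predictions (not data from this study); treating a prediction equal to the threshold as "present" is an assumption made here.

```python
def classification_table(predicted, observed, threshold):
    """Return (true positives, false positives, false negatives, true negatives)."""
    tp = fp = fn = tn = 0
    for p, o in zip(predicted, observed):
        predicted_present = p >= threshold   # assumption: ties count as present
        if predicted_present and o == 1:
            tp += 1
        elif predicted_present and o == 0:
            fp += 1
        elif o == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical predictions and observations for six sites.
predicted = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
observed = [1, 1, 0, 1, 0, 0]

table_050 = classification_table(predicted, observed, 0.50)
table_035 = classification_table(predicted, observed, 0.35)
```

At a threshold of 0.50 one occupied site is missed, whereas at 0.35 every occupied site is classified correctly; the same model therefore receives two different accuracy assessments.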

A more universal accuracy measure should describe 
the accuracy of the system, not just its performance in 
a given scenario (i.e., for a given threshold value). One 
such measure is the area under the relative operating 
characteristic (ROC) curve. An ROC curve is a plot of 
the sensitivity and false positive values obtained by 
considering a large number of threshold probability 
values. For a given threshold, sensitivity is the proportion
of occupied sites correctly classified by the model
as occupied. The false positive fraction (or commission
error) is the proportion of unoccupied sites incorrectly
predicted occupied by the model. The area
under this curve, expressed as a proportion of the area
yielded by a model with perfect accuracy, provides a
measure of discrimination ability. An area of 0.5 suggests
that the discrimination ability of a model is
equivalent to that obtained by a random model (i.e., a
random assignment of predicted values to sites). As
discrimination ability improves, the area under the
curve increases up to a maximum of one.

Figure 32.1. Graphical representation of predictive performance of models for ten species of forest songbirds. The first two graphs for each species depict discrimination ability using a discrimination histogram and an ROC curve. The third graph for each species is a calibration plot. Scientific names for all species listed can be found in Table 32.1.

ROC curves were developed for each of the ten 
models evaluated, by plotting the sensitivity and false 
positive values obtained for one hundred evenly 
spaced probability thresholds. The thresholds 
spanned the range of predicted values within the 
evaluation data. The ROC plots for each species are 
presented in Figure 32.1. In each graph, discrimina- 
tion performance equivalent to random is indicated 
by the 45° line. As discrimination accuracy increases,
the curve tends toward the upper lefthand corner of 
the graph. The ROC curves in Figure 32.1 indicate 
fair to good discrimination performance for all ten 
species models. 
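
The construction of an ROC curve from evenly spaced thresholds can be sketched as follows. The data are hypothetical, the function names are illustrative, and the trapezoidal rule used for the area is an assumption; the text does not state how the area was integrated.

```python
def roc_points(predicted, observed, n_thresholds=100):
    """(false positive fraction, sensitivity) pairs for evenly spaced
    thresholds spanning the range of the predictions."""
    lo, hi = min(predicted), max(predicted)
    n_occupied = sum(observed)
    n_unoccupied = len(observed) - n_occupied
    points = {(0.0, 0.0), (1.0, 1.0)}            # end points of the curve
    for k in range(n_thresholds):
        t = lo + (hi - lo) * k / (n_thresholds - 1)
        tp = sum(1 for p, o in zip(predicted, observed) if o == 1 and p >= t)
        fp = sum(1 for p, o in zip(predicted, observed) if o == 0 and p >= t)
        points.add((fp / n_unoccupied, tp / n_occupied))
    return sorted(points)


def area_under_curve(points):
    """Trapezoidal area under a curve given as sorted (x, y) pairs."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical predictions and observations for six sites.
predicted = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
observed = [1, 1, 0, 1, 0, 0]
auc = area_under_curve(roc_points(predicted, observed))
```

For these six sites the area is 8/9 (about 0.89), because eight of the nine possible occupied/unoccupied pairs are ranked correctly.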

An equivalent measure of discrimination ability to 
the area under the ROC curve is the Mann-Whitney- 
Wilcoxon test (Hanley and McNeil 1982). In this 
test, all possible pairs of sites are compared, and the 
ability of the model to accurately rank occupied sites 
as more likely to support the species than unoccu- 
pied sites is evaluated. This measure is easily calcu- 
lated by any statistical package. However, it doesn't 
preclude the need to examine the ROC curve itself, 
especially if two equivalent models are being com- 
pared. The curve provides significant information on 
model performance at any given threshold level and 
describes the relationship between sensitivity and 
false positive values as the threshold level is changed 
(see Pearce and Ferrier 2000 for further discussion of 
the interpretation of the ROC curve). For each 
model, we calculated an index of discrimination ac- 
curacy using the Mann-Whitney-Wilcoxon statistic. 
A standard error for this index was calculated using 
bootstrapping (two hundred samples). As for the 
ROC area, this discrimination index ranges between 
0.5 and 1, with 0.5 indicating discrimination performance
equivalent to a random model and 1 indicating
complete separation between the predictions
for occupied and unoccupied sites. Pearce and Ferrier 
(2000) provide guidelines for interpreting the 0.5—1 
value range. They suggest that values greater than 
0.9 indicate an excellent level of discrimination be- 
cause the sensitivity values are high relative to the 
false positive values. Values between 0.7 and 0.9 in- 
dicate a reasonable level of discrimination, while val- 
ues between 0.5 and 0.7 indicate poor to marginal 
discrimination ability because the sensitivity rate is 
not much higher than the false positive rate. Based 
on these criteria, models for all ten species show ac- 
ceptable (ROC area greater than 0.7) levels of dis- 
crimination, with discrimination indices in the 0.73- 
to-0.93 range (Table 32.2). Each of these ten models 
therefore provides a good index of species occur- 
rence within the Great Lakes region. 
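
A minimal sketch of the pairwise (Mann-Whitney-Wilcoxon) index and a bootstrap standard error is given below. The data are hypothetical. Resampling sites with replacement and discarding resamples that contain only one class are assumptions made here, as the text states only that two hundred bootstrap samples were used.

```python
import random


def discrimination_index(predicted, observed):
    """Proportion of occupied/unoccupied site pairs in which the occupied
    site has the higher prediction (ties count one half)."""
    occupied = [p for p, o in zip(predicted, observed) if o == 1]
    unoccupied = [p for p, o in zip(predicted, observed) if o == 0]
    wins = 0.0
    for p1 in occupied:
        for p0 in unoccupied:
            if p1 > p0:
                wins += 1.0
            elif p1 == p0:
                wins += 0.5
    return wins / (len(occupied) * len(unoccupied))


def bootstrap_se(predicted, observed, n_boot=200, seed=0):
    """Standard deviation of the index over bootstrap resamples of sites."""
    rng = random.Random(seed)
    n = len(predicted)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        obs = [observed[i] for i in idx]
        if 0 < sum(obs) < n:                      # need both classes present
            estimates.append(
                discrimination_index([predicted[i] for i in idx], obs))
    mean = sum(estimates) / n_boot
    return (sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)) ** 0.5


# Hypothetical predictions and observations for six sites.
predicted = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
observed = [1, 1, 0, 1, 0, 0]
index = discrimination_index(predicted, observed)
se = bootstrap_se(predicted, observed)
```

Under the guidelines quoted above, an index of about 0.89 for these six sites would fall in the "reasonable" class, although with so few sites the bootstrap standard error is necessarily large.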


Does a Model Provide an Accurate Estimate of the 
Probability of Detecting a Species at Each Site? 


The second approach considered by Pearce and Ferrier 
places greater emphasis on the predictions than on the 
observations. That is, the researcher is interested in 
how accurately the model can provide estimates of the 
probability of a species being detected at a site. For 
presence/absence data, this can be examined graphi- 
cally by deriving a calibration plot. This plot is de- 
rived by dividing the predicted probability range into 
classes (in this case, ten evenly spaced classes between 
zero and one) and plotting the proportion of occupied 
sites within each class (y-axis) against the median pre- 
dicted value for the class (x-axis). For a well- 
calibrated model (as shown for the Nashville warbler 
in Fig. 32.1), the points should be distributed along a 
45° line, where the observed proportion of occupied
sites equals the median predicted value for each class. 
The relationship between model predictions and ob- 
servations can be modeled using logistic regression to 
relate the observed values to the logit of the predic- 
tions. This line can be added to the calibration plot to 
further describe the distribution of the points around 
the 45° line.
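
The points of a calibration plot can be computed as in the sketch below. The function name and example data are hypothetical, and skipping classes that contain no sites is a choice made here for illustration.

```python
import statistics


def calibration_points(predicted, observed, n_classes=10):
    """For each probability class that contains sites, return
    (median predicted value, observed proportion of occupied sites)."""
    classes = [[] for _ in range(n_classes)]
    for p, o in zip(predicted, observed):
        k = min(int(p * n_classes), n_classes - 1)   # p == 1.0 goes in the top class
        classes[k].append((p, o))
    points = []
    for members in classes:
        if not members:
            continue                                  # empty classes are not plotted
        median_prediction = statistics.median(p for p, _ in members)
        proportion_occupied = sum(o for _, o in members) / len(members)
        points.append((median_prediction, proportion_occupied))
    return points


# Hypothetical validation data, not taken from the study.
points = calibration_points([0.05, 0.15, 0.15, 0.95, 0.95], [0, 0, 1, 1, 1])
```

For a well-calibrated model the second value of each pair should be close to the first, which places the points near the 45° line.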

A calibration plot was developed for each of the ten
species models (Fig. 32.1). The fitted regression line in
each of these plots follows the 45° line fairly closely
for all species except the bay-breasted warbler (Dendroica
castanea). This indicates that nine out of the
ten models are well calibrated.


TABLE 32.2.

Results of statistical evaluation of predictive performance.

                                                Cox model:       Null deviance  Deviance  Cox statistic    Deviance  Bias statistic    Spread statistic
Species(a)                    ROC (se)          a + b(logit p)   D(0,1)         D(a,b)    D(0,1) - D(a,b)  D(a,1)    D(0,1) - D(a,1)   D(a,1) - D(a,b)
American redstart             0.7335 (0.0023)   0.083 + 1.104    315.14         314.41    0.73
Bay-breasted warbler          0.9228 (0.0019)   2.329 + 2.587    87.44          76.04     11.40 (0.01)     242.16    -154.72 (0.01)    166.11 (0.01)
Blackburnian warbler          0.7962 (0.0023)   -0.044 + 1.054   197.04         196.66    0.37
Black-throated blue warbler   0.8454 (0.0022)   0.180 + 0.993    157.49         156.63    0.86
Black-throated green warbler  0.7972 (0.0019)   0.473 + 1.127    241.03         236.30    4.73
Black-and-white warbler       0.7604 (0.0021)   -0.082 + 0.938   268.38         268.13    0.25
Chestnut-sided warbler        0.7839 (0.0021)   0.040 + 0.832    288.05         286.22    1.83
Magnolia warbler              0.8813 (0.0015)   0.385 + 1.436    189.07         183.86    5.21
Nashville warbler             0.7792 (0.0022)   0.054 + 0.990    268.57         268.41    0.16
Tennessee warbler             0.8861 (0.0016)   0.510 + 1.117    129.59         127.01    2.58

(a) See Table 32.1 for scientific names.

To quantify the degree of calibration for each of 
the models, three statistics can be calculated. The 
first statistic, called the Cox statistic, describes 
whether there is systematic departure from the 45°
line in the calibration plot. Calibration error can 
then be partitioned further into two measurable 
components of systematic error for which statistics 
can be calculated, bias and spread, and a third com- 
ponent, unexplained error. These components can be 
best explained in terms of the regression line fitted to 
observations and predictions in a calibration plot 
(Fig. 32.1). Bias and spread can be thought of as the 
intercept and slope, respectively, of this line. If a 
model is well calibrated, then the points in a calibration
plot should lie along the 45° line. Bias describes
a consistent overestimate or underestimate of the
probability of occurrence, resulting in an upward or 
downward shift in the regression line across the en- 
tire predicted probability range. Spread describes the 
systematic departure of the regression line from a 
gradient of 45 degrees. A gradient between 0 and 1 
implies that the predicted values less than 0.5 are un- 
derestimating the occurrence of the species and that 
predicted values greater than 0.5 are overestimating 
the occurrence of the species. A gradient greater than 
1 indicates that the converse is occurring. 
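
The intercept (bias) and slope (spread) of the regression line described above can be estimated by regressing the observations on the logit of the predictions. The sketch below is one way to do this, using a hand-written Newton-Raphson fit and simulated data; the function names, the seed, and the simulation settings are all assumptions made for illustration, and predictions must lie strictly between zero and one for the logit to be defined.

```python
import math
import random


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_calibration_line(predicted, observed, iterations=25):
    """Fit observed ~ a + b * logit(predicted); a is bias, b is spread."""
    x = [logit(p) for p in predicted]
    a, b = 0.0, 1.0                      # start from perfect calibration
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, observed):
            p = inv_logit(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def simulate(true_slope, n=5000, seed=7):
    """Hypothetical validation data in which the true logit of occurrence
    equals true_slope * logit(prediction)."""
    rng = random.Random(seed)
    predicted = [rng.uniform(0.05, 0.95) for _ in range(n)]
    observed = [1 if rng.random() < inv_logit(true_slope * logit(p)) else 0
                for p in predicted]
    return predicted, observed


a_good, b_good = fit_calibration_line(*simulate(1.0))   # well calibrated
a_over, b_over = fit_calibration_line(*simulate(0.5))   # predictions too extreme
```

In the first simulation the fitted line should have an intercept near 0 and a slope near 1. In the second the slope should be near 0.5, the case in which predictions below 0.5 underestimate and predictions above 0.5 overestimate occurrence.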

Bias usually indicates that the prevalence of the 
species in the evaluation sample is greater or less than 
that in the model development sample due to, for ex- 
ample, the use of different survey techniques, or sea- 
sonal variation in the abundance or detectability of 
the species. Bias may also occur when a model developed 
in one region is applied to another region where 
the species is more or less prevalent. Spread error indicates 
misspecification of the model. 

The third component of calibration, unexplained 
error, relates to the variability of individual records 
around the regression line fitted to predicted and ob- 
served values and describes variation not accounted 
for by the bias and spread of a model. Some of this 
variation may arise because particular covariate pat- 
terns or habitat types are not well represented in the 
development of the model. These sources of error may 
be identified through an analysis of residuals. These 
results are not presented here. 

To partition the calibration into its error compo- 
nents, bias and spread, the observations were modeled 
as a function of the logit of the predictions, using lo- 
gistic regression. The regression equation obtained for 
each species is listed in Table 32.2. In this equation, 
the a and b coefficients represent bias and spread re- 
spectively. The significance of bias and spread error in 
the predictions can be examined by calculating the de- 
viance of the regression model fitted to observations 
and predictions, termed the Cox model. This deviance 
value, denoted deviance(a,b), describes the deviance of 
observations from predictions after accounting for any 
systematic bias or spread in the predictions. There- 
fore, if the deviance(a,b) is significantly different from 
the null deviance(0,1) of this Cox model, then predic- 
tions from the original model exhibit a significant level 
of bias or spread error. This change in deviance can be 
compared to a chi-squared distribution with two degrees 
of freedom. As shown in Table 32.2, the change 
in deviance (Cox statistic) is not significant for any 
model except that of the bay-breasted warbler. 
Nine of the ten models evaluated therefore generate 
well-calibrated predictions that can be used confidently 
at face value as estimated probabilities of 
occurrence. 
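
The calculation of the Cox statistic can be sketched in a few lines of code. The example below is only an illustration, not the software used in this study: the function names and the thirty-site data set are hypothetical, and the logistic regression is fitted with a plain Newton-Raphson loop so that no statistical package is required. Predictions must lie strictly between 0 and 1 and take at least two distinct values.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))


def binomial_deviance(obs, probs):
    """-2 x log-likelihood of 0/1 observations given probabilities."""
    loglik = sum(math.log(p) if y == 1 else math.log(1.0 - p)
                 for y, p in zip(obs, probs))
    return -2.0 * loglik


def fit_cox_model(obs, preds, max_iter=50, tol=1e-10):
    """Fit logit(P(obs = 1)) = a + b * logit(pred) by Newton-Raphson."""
    x = [logit(p) for p in preds]
    a, b = 0.0, 1.0  # start from perfect calibration
    for _ in range(max_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, yi in zip(x, obs):
            mu = inv_logit(a + b * xi)
            w = mu * (1.0 - mu)
            g_a += yi - mu            # gradient with respect to a
            g_b += (yi - mu) * xi     # gradient with respect to b
            h_aa += w                 # entries of the information matrix
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b


def cox_statistic(obs, preds):
    """Return a, b, deviance(0,1) - deviance(a,b), and its P value (2 df)."""
    a, b = fit_cox_model(obs, preds)
    fitted = [inv_logit(a + b * logit(p)) for p in preds]
    dev_01 = binomial_deviance(obs, preds)    # predictions taken at face value
    dev_ab = binomial_deviance(obs, fitted)   # after correcting bias and spread
    stat = max(dev_01 - dev_ab, 0.0)
    p_value = math.exp(-stat / 2.0)  # chi-squared upper tail for 2 df
    return a, b, stat, p_value


# Hypothetical evaluation sample: predictions at three levels, with the
# species recorded more often than predicted at every level.
preds = [0.2] * 10 + [0.5] * 10 + [0.8] * 10
obs = [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2 + [1] * 9 + [0] * 1
a, b, stat, p_value = cox_statistic(obs, preds)
print(a, b, stat, p_value)
```

In this hypothetical sample the fitted intercept is positive, reflecting predictions that are too low, and the change in deviance exceeds the 5 percent critical value for two degrees of freedom.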

The predictions for the bay-breasted warbler were 
examined further to determine whether the poor cali- 
bration result obtained for this model was due to bias 
error or spread error or both. To this end, bias and 
spread statistics were calculated. The bias statistic is 
calculated as the difference between the deviance(0,1) 
of the original model and the deviance obtained from 
the Cox model when the b coefficient is held at 1, deviance(a,1). 
This tests the hypothesis that there is significant 
bias error in model predictions. This change in 
deviance, when compared to a chi-squared distribution 
with one degree of freedom, was highly significant. 
The high value (2.329) for the intercept in the 
Cox model suggests that the predictions are, on aver- 
age, underestimating the occurrence of the species 
within the validation sample. 

The significance of spread error was calculated as 
the difference between the deviance(a,b) of the Cox 
model, and the deviance(a,1) of the modified version 
of this model in which the b coefficient is held at 1. 
This change in deviance tests the hypothesis that there 
is no spread error, given the presence of bias error. 
This statistic was highly significant when compared to 
a chi-squared distribution with one degree of freedom. 
The positive b coefficient in the Cox model suggests 
that the underestimation problem revealed by the test 
for bias is more pronounced for high predicted proba- 
bilities than it is for low predicted probabilities. Pre- 
dictions from this model are not well calibrated and 
therefore cannot be used confidently as estimated 
probabilities of occurrence. However, the evaluation 
of discrimination ability indicated that predictions 
from the model nevertheless perform well as a rank 
index of likelihood of occurrence. 
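
The three tests described above are nested comparisons of deviance and can be summarized compactly. The notation below is introduced here for illustration only: y is the observed presence or absence, p is the model prediction, D(., .) is the deviance of the Cox model with the stated coefficients, and the labels D_Cox, D_bias, and D_spread are ours rather than the original authors'. In D(a,1) the intercept a is re-estimated with b fixed at 1, and each chi-squared reference distribution applies approximately under the corresponding null hypothesis.

```latex
\begin{align*}
\operatorname{logit}\Pr(y = 1) &= a + b\,\operatorname{logit}(p) \\
D_{\mathrm{Cox}}    &= D(0,1) - D(a,b) \sim \chi^2_{2} \\
D_{\mathrm{bias}}   &= D(0,1) - D(a,1) \sim \chi^2_{1} \\
D_{\mathrm{spread}} &= D(a,1) - D(a,b) \sim \chi^2_{1} \\
D_{\mathrm{Cox}}    &= D_{\mathrm{bias}} + D_{\mathrm{spread}}
\end{align*}
```

The last line shows why the Cox statistic can be partitioned: the bias and spread statistics sum exactly to the overall change in deviance.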


Conclusion 


Graphical and statistical evaluation of the predictive 
performance of the ten models for the Great Lakes 
Basin suggests that these models are suited to a wide 
range of applications, such as reserve design, explor- 
ing relationships between regional occurrence and en- 
vironmental factors, or locating potential survey sites. 
They all provide predictions that have good to excel- 
lent discrimination ability, thereby providing good 
rank indices of occurrence across the region. All the 
models except that of the bay-breasted warbler are 
also well calibrated. The predictions from these mod- 
els can therefore generally be relied on to provide an 
accurate estimate of the probability of detecting a 
species at a given site. The evaluation has given us 
confidence in using the models for regional conserva- 
tion planning. 

The techniques presented here are readily applied 
using any statistical package that can accommodate 
logistic regression analysis. Therefore, the ease of application 
and interpretation of these evaluation techniques 
suggests that they should be more widely applied to test 
the performance of models of species distribution. Table 
32.3 provides a summary of the evaluation approaches 
described by Pearce and Ferrier (2000) and applied in 
the current study. As a minimum, an evaluation of discrimination 
ability should be routinely performed before 
applying distributional models in any conservation 
planning or management exercise. 


TABLE 32.3. 

Summary of evaluation objectives and graphical and statistical tests. 

                          Evaluation objective 1         Evaluation objective 2 
Definition                The model is to be used to     The predictions are to be 
                          provide a rank index of        used at face value. 
                          species occurrence. 
Attribute of prediction   Discrimination                 1. Calibration 
quality                                                  2. Bias 
                                                         3. Spread 
Graphical evaluation      1. Discrimination histogram    Calibration graph 
                          2. ROC curve 
Statistical evaluation    Mann-Whitney-Wilcoxon test     1. Cox statistic 
                                                         2. Bias measure 
                                                         3. Spread measure 


CHAPTER 


33 


Toward Better Atlases: 
Improving Presence-Absence Information 


Douglas H. Johnson and Glen A. Sargeant 


Atlases are effective tools for illustrating the spatial 
distribution of species (Bibby et al. 1992). A 
typical atlas represents a geographical area as a grid of 
cells. The entry in each cell indicates whether a partic- 
ular species has been recorded in the area represented 
by that cell. Sometimes more than presence or absence 
is indicated, such as the type or strength of evidence 
for presence (e.g., recent versus historic records, or 
museum specimens versus field observations). An ex- 
ample of an atlas for breeding birds (western mead- 
owlark, Sturnella neglecta) in North Dakota is given 
in Figure 33.1a in the color section. Squares denote 
confirmed (e.g., nests or dependent young observed) 
or suspected (e.g., singing male detected) evidence of 
breeding during 1950-1972. 

Atlases serve many purposes. State breeding-bird 
atlases (e.g., Peterson 1995; Jacobs and Wilson 1997), 
for example, contribute information used in field 
guides. Atlases can be used to show habitat affinities 
of birds at a broad scale and can illustrate sympatry or 
allopatry of closely related or competitive species. At- 
lases also distinguish species that have broad ecologi- 
cal tolerances, and hence wide geographical distribu- 
tions, from those with narrow tolerances and 
correspondingly limited distributions. Atlases help 
suggest areas where habitat changes, such as those 
caused by urbanization, can affect certain species. Gap 
analysis projects make effective use of atlases in developing 
layers in geographic information systems and 
identifying areas of high biotic diversity that are not 
protected by, for example, parks and nature reserves 
(Scott et al. 1993). 

Ideally, an atlas would be based on exhaustive sur- 
veys of all habitats in each cell. Rarely is this perfec- 
tion achieved, however, especially for large areas. 
More often, atlases are based on systematic surveys of 
a subsample of cells, on data gathered opportunisti- 
cally, or on some combination of these. In the latter 
two cases, the amount of effort expended by observers 
in cells is likely to vary widely. This disparity in effort 
results in larger numbers of species recorded near 
human population centers, especially university 
towns, than in more remote locations, regardless of 
the true numbers of species that actually occur in each 
type of area. 

Developing an atlas is complicated by variability in 
scale and time (extent and grain) (Fielding, Chapter 
21). Atlases are subject to two kinds of errors: false 
positives (sometimes called commission errors), which are cells 
with reported occurrences of a species that does not 
really occupy those cells; and false negatives (omission 
errors), which are cells with no indication of a species that 
really is there. False positives are relatively uncommon; 
they arise from misidentification, observations 
of escaped animals, or individuals blown off course 
during migration, for example. False negatives are 
much more likely to occur. Most such errors reflect 
insufficient effort by observers, especially for species 
that are rare, secretive, or occur only sporadically in 
some cells. 

Recent advances in three fields facilitate the im- 
provement of atlases. First, a considerable amount of 
effort has gone into developing models to predict 
species occurrences from habitat information (e.g., 
Verner et al. 1986a,b) and other explanatory vari- 
ables, such as climatic features (e.g., Nicholls 1989; 
Walker 1990). By incorporating spatial information as 
well as other explanatory variables, such models can 
be useful for improving atlases (e.g., Le Duc et al. 
1992; Buckland and Elston 1993; Smith 1994; Högmander 
and Møller 1995; Augustin et al. 1996). Second, 
in image processing, observed images are sometimes 
treated as degraded views of true images. 
Methods of image reconstruction have been developed 
(e.g., Besag 1986; Besag et al. 1991); these can be 
adapted to atlas mapping by considering an observed 
atlas to be an imperfect representation of the true distribution 
of a species (Heikkinen and Högmander 
1994). Third, the widespread availability of powerful 
computers has permitted more-intensive statistical 
analyses, including methods such as Markov chain 
Monte Carlo (Geman and Geman 1984; Gelfand et al. 
1990; Gelfand and Smith 1990; Besag et al. 1995). 
Some of these methods are complex, however, and 
currently not readily accessible to many biologists. 

Our objective is to demonstrate some relatively 
simple and intuitive smoothing methods that can be 
applied to observed atlases to improve their accuracy. 
We use presence/absence information from nearby 
cells, as well as auxiliary data, such as habitat infor- 
mation, from each target cell. We propose a proxy for 
the effort expended by observers in a cell and incorpo- 
rate that variable as well. 

We illustrate the methods on both real and artificial 
data. The real data are breeding-bird atlases for west- 
ern meadowlark (Fig. 33.1a), Baird’s sparrow (Am- 
modramus bairdii, see Fig. 33.2a in color section), and 
blue jay (Cyanocitta cristata, see Fig. 33.3a in color 
section) in North Dakota (Stewart 1975). Cells in the 
atlas grids are legal townships, each about 6 miles 
square (total area is 9,324 hectares). Artificial data 
(see Figs. 33.4a, 33.5a in color section) were generated 
on a similar grid to exemplify a species (like the blue 
jay, which is associated with woodland) that has a 
complex distribution that depends on a particular 
habitat, which is known and mapped. Two artificial 
distributions—one of a highly detectable species, an- 
other of a less-detectable species—are treated as 
known (Appendix). Our methods are applied to sub- 
samples drawn from the distributions to demonstrate 
improvements in accuracy, that is, to validate the 
methods. 

We were able to test the predictions generated by 
the methods developed here with an independent data 
set. Browder (1998) conducted point counts of birds 
at 885 points in 1995 and/or 1996. Points were clus- 
tered into forty-four roadside routes. Her methods 
were analogous to those used in the North American 
Breeding Bird Survey (Robbins et al. 1986). 


Methods 


The original data (Stewart 1975) are mapped on a 
grid of cells. For each cell, the indicator variable I as- 
sumes a value of 1 if the particular species was 
recorded there and 0 if it was not. We wish to estimate 
for each cell the probability (J) that the species actu- 
ally occurs there. Thus, each entry will be a value be- 
tween 0 and 1. 


Simple Spatial Smoothing 


The first model we describe simply replaces I in each 
cell with I = 0 with a weighted average of values of I 
from adjacent and diagonal townships. Weights (w) 
are inverses of mean distances (d) between the target 
cell and neighboring cells, standardized to sum to 1. 
This method of smoothing is illustrated for a single 
target cell and its eight neighbors or near-neighbors in 
Figure 33.6 (see color section). It produces a value in 
the target cell ranging from J = 0 (if the species was 
not recorded in any neighboring cell) to J = 0.84 (if 
the species was recorded in every neighboring cell but 
not in the target cell). 
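
A minimal sketch of this smoothing step follows. Several details are assumptions made for illustration: the function names are hypothetical, center-to-center distances in cell widths stand in for the mean distances of Figure 33.6, and cells on the edge of the grid are standardized over whichever neighbors exist. The optional `self_distance` argument lets the target cell itself carry weight; the reported ceiling of 0.84 suggests that it does, but the distance value used in the test below is a guess and will not reproduce that figure exactly.

```python
import math

# Offsets to the eight adjacent and diagonal cells.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                    (0, -1),           (0, 1),
                    (1, -1),  (1, 0),  (1, 1)]


def smooth_cell(grid, row, col, self_distance=None):
    """Return J for one cell of a 0/1 grid of recorded presences (I)."""
    if grid[row][col] == 1:
        return 1.0  # cells with a record are left unchanged
    weights, values = [], []
    if self_distance is not None:
        # Assumed: the target cell (I = 0) shares in the standardization.
        weights.append(1.0 / self_distance)
        values.append(0)
    for d_row, d_col in NEIGHBOR_OFFSETS:
        r, c = row + d_row, col + d_col
        if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
            weights.append(1.0 / math.hypot(d_row, d_col))  # inverse distance
            values.append(grid[r][c])
    total = sum(weights)  # standardizing makes the weights sum to 1
    return sum(w * v for w, v in zip(weights, values)) / total


def smooth_grid(grid, self_distance=None):
    """Apply smooth_cell to every cell, always reading the original grid."""
    return [[smooth_cell(grid, r, c, self_distance)
             for c in range(len(grid[0]))]
            for r in range(len(grid))]


# Hypothetical 3 x 3 atlas: the species was recorded in the four adjacent
# cells but not in the diagonal cells or the target cell at the center.
example = [[0, 1, 0],
           [1, 0, 1],
           [0, 1, 0]]
print(smooth_cell(example, 1, 1))
```

Because every smoothed value is computed from the original records rather than from already smoothed neighbors, the order in which cells are visited does not affect the result.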


Habitat Commonality 


The assumption underlying the simple spatial smoothing 
method, that a species probably occurs in a target 
cell if it occurs in most neighboring cells, may not be 
true if the target cell is lacking essential habitat that is 
available in neighboring cells. We incorporated this 
concept by using habitat weights (Table 33.1) and 
multiplying the initial simple smoothing weights (w) 
by the habitat weights before standardizing them to 
sum to 1. Habitat weights are 1 if the target cell and 
the neighbor cell both have the habitat required by the 
species. They are also 1 if the required habitat is available 
in the target cell and the species is present in a 
neighbor cell without that habitat; or if the target cell 
lacks the requisite habitat and the species is absent in 
a neighboring cell with the habitat. This forms the 
basis for the second model. 


TABLE 33.1. 

Weights used in the habitat commonality smoothing method, 
based on the presence or absence of the species in a 
neighbor cell and on the presence or absence of suitable 
habitat in the target cell and the neighbor cell. 

Species present in neighbor cell 
                                   Suitable habitat in neighbor cell 
Suitable habitat in target cell    Yes     No 
Yes                                1       1a 
No                                 0       1 

Species absent in neighbor cell 
                                   Suitable habitat in neighbor cell 
Suitable habitat in target cell    Yes     No 
Yes                                1       0 
No                                 1a      1 

aDenotes nonintuitive weights (weights of 1 even though cells do not 
share the required habitat). The first indicates that a species is 
present in a neighbor cell that lacks appropriate habitat, so it is 
reasonable to deduce that the species might be present in the target 
cell, which does have the habitat. This assumes an error either in the 
habitat database or in the assumption that the habitat actually is 
required. The second says that the species is absent from a 
neighboring cell with the requisite habitat, so it is reasonable to 
assume it will be absent from the target cell, which does not have 
suitable habitat. 
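
The weighting rule of Table 33.1 can be sketched as a small function. This is an illustration under stated assumptions rather than the code used for the analyses: the function names are hypothetical, the weight of 1 for two cells that both lack the habitat is our reading of the table, and returning all zeros when every habitat weight is zero is a choice made here because the text does not cover that case.

```python
def habitat_weight(species_in_neighbor, habitat_in_target, habitat_in_neighbor):
    """Habitat weight (0 or 1) for one neighbor cell, following Table 33.1."""
    if habitat_in_target == habitat_in_neighbor:
        return 1  # cells share the same habitat status
    if habitat_in_target:
        # Target has the habitat, neighbor lacks it: a record in the neighbor
        # is still informative (nonintuitive weight), an absence is not.
        return 1 if species_in_neighbor else 0
    # Target lacks the habitat, neighbor has it: an absence in the neighbor
    # is informative (nonintuitive weight), a presence is not.
    return 0 if species_in_neighbor else 1


def combined_weights(distance_weights, habitat_weights):
    """Multiply the two sets of weights, then standardize to sum to 1."""
    products = [w * h for w, h in zip(distance_weights, habitat_weights)]
    total = sum(products)
    if total == 0:
        return [0.0] * len(products)  # assumed handling; not specified in text
    return [p / total for p in products]


# Hypothetical target cell without habitat and three neighbors, described as
# (species recorded, suitable habitat present).
neighbors = [(True, True), (False, True), (False, False)]
h = [habitat_weight(sp, False, hab) for sp, hab in neighbors]
w = combined_weights([1.0, 1.0, 0.5], h)
print(h, w)
```

In this example the record from the neighbor with suitable habitat is given no weight, because the target cell lacks that habitat, so the smoothed value for the target cell remains 0.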

The three species we illustrate have different habi- 
tat affinities. The western meadowlark is a grassland 
generalist, which uses a variety of short to tall, native 
or introduced, grassy to brushy habitats (Dechant et 
al. 1999a). Baird's sparrow is a grassland specialist; its 
favored habitat for breeding is grasslands of medium 
height, usually composed of native plant species, with 
little woody vegetation, and preferably in large 
patches (Dechant et al. 1999b). The blue jay is a 
woodland edge species, rarely found in open habitats 
(Stewart 1975). 

In North Dakota, suitable habitat for the western 
meadowlark exists in virtually every cell (township), 
so the habitat commonality method will offer no im- 
provement over simple spatial smoothing. And, al- 
though Baird's sparrow is more restricted in its selec- 
tion of grassland, there is no extant database that 
would outline suitable habitat for the species. For the 
blue jay, useful information about one component of 
its favored habitat (forestland) was available from the 
land-use and land-cover maps (Anderson et al. 1976) 
developed by the USGS (1986) from 
color-infrared aerial photography. 


Search Effort 


With the third model, we account for variable search 
efforts among cells. For instance, we would not want 
to “borrow” information about a species being absent 
in a neighboring cell if very little observation had 
taken place there. Because the effort expended by ob- 
servers in each cell was not recorded (cf. Osborne and 
Tigar 1992), we developed a proxy for it based on the 
number of species recorded. We used the double-log 
function, 


$$E = \frac{\log(\log(N_{\text{species}} + 1) + 1)}{\log(\log(\max(N_{\text{species}}) + 1) + 1)} \tag{33.1}$$


where E is a proxy for effort and Nspecies is the num- 
ber of species recorded in a cell. This function incorpo- 
rates the notion that cells with greater effort will have 
more species reported, but that, once the effort becomes 
substantial, new species will be added only slowly. Also 
note that this function estimates effort to be zero for 
cells with no species observed; we believe that, in our 
example, all townships in North Dakota have breeding 
birds, which would have been detected with even the 
slightest effort. In our North Dakota examples, max 
(Nspecies) = 124. E attains the value of 0.30 with but a 
single species recorded, reaches 0.60 with six species, 
and reaches 0.90 with forty-eight species. 
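
The behavior of the effort proxy can be checked with a short calculation. The sketch below assumes that Equation 33.1 has the form E = log(log(Nspecies + 1) + 1) / log(log(max(Nspecies) + 1) + 1) with natural logarithms, which reproduces the values quoted above; the function name is hypothetical.

```python
import math


def effort_proxy(n_species, max_species):
    """Double-log effort proxy E, scaled to equal 1 at max_species."""
    numerator = math.log(math.log(n_species + 1) + 1)
    denominator = math.log(math.log(max_species + 1) + 1)
    return numerator / denominator


# Values for the North Dakota example, where max(Nspecies) = 124.
for n in (0, 1, 6, 48, 124):
    print(n, round(effort_proxy(n, 124), 2))
```

A cell with no species recorded receives an effort of zero, and the cell with the largest species list receives an effort of 1.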

This proxy for effort is based on the implicit as- 
sumption that each cell has the same number of 
species that could be detected. Although this assump- 
tion is no doubt false in general, it is likely to be 
approximately true for cells that are nearby, especially 
if they share similar habitats. 


Testing the Models 


For use here, we aggregated the points at which Brow- 
der (1998) surveyed birds into the legal townships that 
contained the points. The 885 points were ultimately 
grouped into ninety-five legal townships, with the 
number of points per township ranging from one to 
twenty-three (mean = 9.3). We then determined for 
each township whether or not each species was de- 
tected on any point count in either year. We compared 
resulting presence/absence values to the values pre- 
dicted from our model as well as to the presence/ab- 
sence values in the original atlas. For the meadowlark 
and Baird's sparrow, we used the models that ac- 
counted for search effort; for the blue jay, we used the 
model that incorporated habitat information as well 
as search effort. 

To evaluate the models, we computed the sum of 
squared errors as well as the sum of absolute values of 
the errors. We additionally conducted logistic regression 
analyses with observed presence/absence (Browder's re- 
sults) as the binary response variable. Explanatory vari- 
ables were predicted values from each model and from 
the presence/absence values in the original atlas. 
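
The two error criteria are simple to compute. The sketch below uses hypothetical function names and a made-up set of four townships; it is meant only to show how atlas entries (0 or 1) and model values (between 0 and 1) are compared against the same independent observations.

```python
def sum_squared_errors(observed, predicted):
    """Sum over cells of (observed - predicted) squared."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def sum_absolute_errors(observed, predicted):
    """Sum over cells of |observed - predicted|."""
    return sum(abs(o - p) for o, p in zip(observed, predicted))


# Hypothetical data for four townships.
observed = [1, 1, 0, 1]            # independent survey: detected or not
atlas = [1, 0, 0, 0]               # original atlas entries
model = [1.0, 0.6, 0.2, 0.4]       # smoothed model values

print(sum_squared_errors(observed, atlas), sum_absolute_errors(observed, atlas))
print(sum_squared_errors(observed, model), sum_absolute_errors(observed, model))
```

Because atlas entries are 0 or 1, each atlas error is 0 or 1 in magnitude, so the two criteria necessarily coincide for the original atlas. They can diverge for model values, since squaring penalizes many small errors less heavily than a few large ones.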


Results 


Spatial smoothing basically "spread" the information 
on occurrences in a cell into neighbor cells, or, from 
another point of view, each target cell "borrowed" information 
from neighbor cells. For the ubiquitous 
western meadowlark, spatial smoothing increased the 
number of cells with positive probabilities of occur- 
rence (Fig. 33.1b), which is an entirely reasonable out- 
come. Similar changes occurred for the Baird's spar- 
row and blue jay (Figs. 33.2b, 33.3b), but we cannot 
be sure that those changes represent actual improve- 
ments in the atlases. 

For the artificial data, for which we know the true 
distribution, simple spatial smoothing tended to fill in 
gaps in the distribution that resulted from insufficient 
effort, leading to improved atlases (Figs. 33.4b, 
33.5b). Offsetting that gain in accuracy was incorrect 
expansion of the range caused by smoothing information 
from cells at the border of the range into cells 
outside the range. Nonetheless, the overall error rate 
for the high-detectability situation decreased from 22 
percent for the actual data to 7 percent after simple 
spatial smoothing. With low detectability, the error 
rate improved from 28 to 13 percent. 


Habitat Commonality 


We had useful habitat information only for the blue 
jay. By incorporating habitat availability in the 
weighting, information was not simply spread out; 
such spreading occurred only if suitable habitat was 
available (Fig. 33.3c). Compare, for example, the top 
center of the state in Figures 33.3b and 33.3c. 

With the artificial data, incorporating habitat infor- 
mation (knowing there was no suitable habitat outside 
of the occupied range) enhanced the atlases (Figs. 
33.4c, 33.5c). The improved accuracy was primarily 
due to the trimming of the smoothed atlases outside 
the boundary. Error rates decreased from 22 to 2 per- 
cent for the high-detectability scenario and from 28 to 
11 percent for the low-detectability situation. 


Search Effort 


Adjusting for survey effort improved the atlas for the 
western meadowlark (Fig. 33.1c). The atlas has be- 
come nearly completely filled in, as we believe is ap- 
propriate for this species. We suspect that the Baird's 
sparrow results (Fig. 33.2c) are likewise improved, but 
because we lack knowledge of the true distribution, 
we cannot be sure. Adding information about search 
effort made little apparent difference for the blue jay 
atlas (Fig. 33.3d). 

For the artificial data, the additional information 
about search effort did not improve error rates beyond 
what the habitat commonality method did (Figs. 
33.4d, 33.5d); error rates remained at 2 and 11 per- 
cent for the high- and low-detectability scenarios, re- 
spectively. Nonetheless, the extra step did reduce the 
apparent patchiness of the resulting atlas, which was 
an appropriate change. 


Testing the Model 


The western meadowlark is ubiquitous in North 
Dakota. A perfect atlas would have every cell filled. 
Browder recorded the species in 83.2 percent of the 
cells she surveyed (Table 33.2). In the original atlas, 
only 69.5 percent of those cells were filled. Our model 
produced an average value across all cells of 79.9 per- 
cent, closer to Browder's result and to the presumed 
true value of 100 percent. It could not attain that high 
a value because values smoothed from adjacent cells 
are less than 1. Clearly though, the model indicates 
that western meadowlarks likely occur in each cell 
(Fig. 33.1). Both the sum of squared errors and sum of 
absolute errors were smaller for our model values than 
for original atlas data (Table 33.2), suggesting the su- 
periority of the model values. The logistic regression 
indicated that neither the original atlas nor our mod- 
eled value was useful in predicting where Browder 
would find the species; that result is consistent with 
the species' ubiquity. 

Baird's sparrows are most common in the western 
two-thirds of North Dakota. They are erratic in their 
occurrence in most locations, however, and are easily 
overlooked. Browder detected Baird’s sparrows on six 
(6.3 percent) of the ninety-five townships she visited. 
Only two of those townships had records in the origi- 
nal atlas data, suggesting that the atlas greatly under- 
estimates the distribution of that species. Baird's spar- 
rows were recorded in 12.6 percent of the cells in the 
original atlas (Table 33.2), lower than the average 
model value of 20.0 percent. The sum of squared er- 
rors and sum of absolute errors gave conflicting results 
(Table 33.2). Logistic regression indicated that our 
model values were useful in predicting where Browder 
would find Baird's sparrows (P = 0.035), even after ac- 
counting for the effects of the original atlas value (P = 
0.10). 

Blue jays are distributed widely throughout North 
Dakota where woodland occurs. Their numbers dou- 
bled between 1967, near the time when the original 
atlas was completed, and 1992-1993, near the time of 
Browder's survey (Igl and Johnson 1997). The increase 
is likely due at least in part to increases in woody veg- 
etation such as shelterbelts (L. D. Igl and D. H. John- 
son unpublished data) and possibly to increased feed- 
ing of birds, especially in winter. The sum of squared 
errors and sum of absolute errors were inconsistent 
(Table 33.2). Browder recorded the species in 20.0 
percent of the cells she visited, whereas only 10.5 percent 
of cells were filled in the original atlas. Our 
model improved that only to 11.4 percent, likely because 
it was based on the original data, which do not 
reflect recent increases in the number and distribution 
of blue jays. Both the original atlas value and our 
model value were useful predictors of where Browder 
would find blue jays (P ≤ 0.003). Our model values 
were useful even after atlas values were included in the 
logistic regression (P = 0.018); the converse did not 
hold true (P = 0.75). 


TABLE 33.2. 

Criteria comparing predictive ability of original atlas and atlas 
developed from models described in the text, when applied to 
an independent data set (Browder 1998). 

                                Western        Baird's 
                                meadowlark     sparrow      Blue jay 
Sum of squared errors 
  Original atlas                33.0           14.0         17.0 
  Model values                  15.2           10.9         13.2 
Sum of absolute errors 
  Original atlas                33.0           14.0         17.0 
  Model values                  [illegible]    19.7         20.1 
Average value 
  Browder                       0.832          0.063        0.200 
  Original atlas                0.695          0.126        0.105 
  Model values                  0.799          0.200        0.114 
Logistic regression deviance 
  Original atlas                0.43           1.88         8.89a 
  Model values                  0.56           4.43b        14.40c 
  Model values/original         0.18           2.70d        5.63b 
  Original/model values         0.05           0.16         0.10 

aP < 0.01 
bP < 0.05 
cP < 0.001 
dP < 0.10 


Discussion 


We have demonstrated that relatively simple computa- 
tional methods can markedly improve the accuracy of 
atlases. Without any additional information, simple 
spatial smoothing can improve an atlas, especially for a 
species with a widespread distribution. Knowledge of 
the habitat affinities of a species, in combination with a 
spatial representation of the availability of that habitat, 
can further improve the accuracy (as with the blue jay). 
Improvements may be greater for species with narrow 
ecological requirements, especially if the spatial distri- 
bution of those requirements is known. Finally, knowl- 
edge of the amount of effort expended in each cell can 
be exploited to improve atlases. Such knowledge may 
not be available, but the proxy we offered for effort 
appeared to work well in our examples. 

Browder's (1998) data provided a useful test of the 
models developed here. The model for western mead- 
owlark better represented the ubiquity of that species 
in North Dakota than did the original atlas. For the 
Baird's sparrow and blue jay, model values were better 
predictors than were the original data in logistic 
regression. 

Data from the examples we presented (Stewart 
1975) were not gathered according to any particular 
design. The resulting atlases, even for ubiquitous 
species such as the western meadowlark, have fairly 
large "holes" in them. The methods we presented cannot 
satisfactorily fill such holes. Moreover, they probably 
should not, for such an outcome would be based 
far more on assumptions that underlie the analytic 
method than on the data themselves. All statistical re- 
sults are a product of, in various combinations, the 
data employed and the assumptions on which the ana- 
lytic method is based. More sophisticated methods 
yield more definitive results, but only if the underlying 
assumptions are reasonable. 

The design of an atlas project merits serious consid- 
eration. The optimal design will depend not only on 
the intended uses for the resulting atlas, but also on 
the analytic method to be used on the results. If 
smoothing procedures are not to be used, it is likely 
that the best design for gaining an overall impression 
of the distributions of the various species would in- 
volve samples spaced regularly throughout the grid. If, 
as in our examples, there is an extant database of op- 
portunistically gathered presence-absence data, an op- 
timal design for enhancing an atlas might involve 
drawing a new sample with the probability of includ- 
ing a cell based on the inverse of the effort already ex- 
pended in that cell (or the inverse of the proxy we 
described). 
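
One way such a design might be drawn is sketched below. Everything in it is an assumption made for illustration: the function name, the floor placed on effort (needed because a proxy of zero would give an infinite weight), and the use of the Efraimidis-Spirakis key method for weighted sampling without replacement, which we have substituted because the text does not prescribe a procedure.

```python
import random


def draw_new_sample(effort_by_cell, sample_size, floor=0.05, seed=None):
    """Select cells without replacement, favoring cells with little effort.

    effort_by_cell maps a cell identifier to its effort (or effort proxy).
    Each cell receives weight 1 / max(effort, floor); a random key
    u ** (1 / weight) is drawn per cell and the largest keys are kept.
    """
    rng = random.Random(seed)
    keyed = []
    for cell, effort in effort_by_cell.items():
        weight = 1.0 / max(effort, floor)  # inverse of effort already expended
        key = rng.random() ** (1.0 / weight)
        keyed.append((key, cell))
    keyed.sort(reverse=True)
    return [cell for _, cell in keyed[:sample_size]]


# Hypothetical effort proxies for five townships.
effort = {"T1": 0.0, "T2": 0.3, "T3": 0.6, "T4": 0.9, "T5": 1.0}
print(draw_new_sample(effort, sample_size=2, seed=1))
```

Cells that have received little attention are much more likely to enter the new sample, but well-surveyed cells retain some chance of selection.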

Improved mappings of habitat availability could 
dramatically increase the value of the habitat commonality 
method we presented. Such maps are becoming 
more widely accessible with improved remote-sensing 
capabilities and efforts such as gap analysis. Also likely 
to be helpful is new information or better syntheses of 
information on the habitat affinities of species. 

Other improvements might result from using other 
kinds of auxiliary information. One avenue we intend 
to explore is the possibility of using information about 
certain species to predict the presence or absence of 
other species. A particularly promising application of 
that approach would involve the use of information 
from a highly detectable species to predict the occur- 
rence of a more secretive species that has very similar 
habitat requirements. 

Atlases are becoming increasingly common, not 
only for birds but also for other vertebrates, invertebrates, 
and plants. Improved computer graphics capabilities 
facilitate the economical and widespread dissemination 
of full-color maps and other visual 
products. The World Wide Web offers great potential 
not only to display atlases, but also to encourage indi- 
viduals to provide information to improve them. Con- 
siderable effort often goes into gathering the base 
data, verifying and processing them, and presenting 
the results. As important as the final products are, it 
behooves us to develop and use the best analytical 
methods for them. 


Appendix: Generation of Artificial Data 


We considered two species, one with high detectability 
and one with low detectability. We assumed that for 
the maximum effort expended in any cell, the proba- 
bility (R) of recording the species was R = 0.50 for the 
highly detectable species and R = 0.20 for the less-de- 
tectable species. 

We defined a grid of sixty-five by thirty-five cells, 
about the size of North Dakota used in the other ex- 
amples. Within that grid, we assumed suitable habitat 
existed, and the species actually was present, in the 
cross-shaped region outlined in Figure 33.4. The area 
outside that region was deemed unsuitable for and un- 
occupied by the species. 

To develop effort values for each cell, we used our 
proxy based on the number of species (Nspecies) 
recorded in the cell. We used a beta distribution (with 
exponents α = 1.2 and β = 5) to approximate the distribution 
of Nspecies recorded in the townships of 
North Dakota. For each cell in the grid, we randomly 
selected a value from that beta distribution, multiplied 
it by 125, and rounded down to the next lowest integer to
get a value for Nspecies. That value was used in Equation
33.1 to calculate the effort proxy, E, which
was multiplied by the detectability value (R) to obtain a
probability of recording the species (P) for each cell:


$$P = R \times E$$


We next drew a random variate from a binomial distribution
with probability P. If the outcome was a
“success,” we recorded a detection for that cell (I = 1);
otherwise, the species was not listed as recorded (I = 0).
That set of values for the entire grid represented
the observed or raw data, which we wished to smooth 
using the methods we present here. 

To obtain effort data for use with the search effort
method, we drew a random number (Nspecies’)
from a Poisson distribution with parameter Nspecies
obtained above. This step added random noise, so that 
we did not use the same value of effort to smooth the 
data as we used to create them. 

We then applied the various smoothing methods, 
using the I and Nspecies’ values generated above. We 
obtained two grids, one for a highly detectable species, 
another for a less-detectable species. 
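
The data-generation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cross-shaped region is a hypothetical stand-in for the outline in Figure 33.4, and `effort_proxy` is a placeholder because Equation 33.1 is not reproduced in this appendix. The grid size, beta exponents, multiplier of 125, and detectability values are taken from the text.

```python
import math
import random

N_COLS, N_ROWS = 65, 35      # grid size given in the text
ALPHA, BETA = 1.2, 5.0       # beta-distribution exponents given in the text
R_HIGH, R_LOW = 0.50, 0.20   # detectability (R) of the two species


def in_cross(col, row):
    """Hypothetical cross-shaped occupied region (the real outline is Figure 33.4)."""
    return 28 <= col <= 36 or 14 <= row <= 20


def effort_proxy(n_species):
    """Placeholder for Equation 33.1 (an assumption, not the authors' formula)."""
    return min(1.0, n_species / 125.0)


def poisson(lam, rng):
    """Draw a Poisson variate with Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate(detectability, seed=0):
    """Return (col, row, I, Nspecies') for every cell of the grid."""
    rng = random.Random(seed)
    cells = []
    for row in range(N_ROWS):
        for col in range(N_COLS):
            # Nspecies: beta variate times 125, rounded down
            n_species = math.floor(rng.betavariate(ALPHA, BETA) * 125)
            # P = R x E inside the occupied region, zero elsewhere (assumption)
            p = detectability * effort_proxy(n_species) if in_cross(col, row) else 0.0
            recorded = 1 if rng.random() < p else 0      # single binomial trial
            noisy_effort = poisson(n_species, rng)       # Nspecies' used for smoothing
            cells.append((col, row, recorded, noisy_effort))
    return cells


high = simulate(R_HIGH)
low = simulate(R_LOW)
```

Calling `simulate` once per detectability value yields the two raw grids that the smoothing methods would then take as input.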
Predicting the Distributions of Songbirds in
Forests of Central Wisconsin 


Margaret J. Robertsen, Stanley A. Temple, and John Coleman


Many studies have demonstrated the importance
of fine-grained habitat features, typically 
within plots the size of a bird's territory, on predicting 
the distribution of songbirds (Cody 1985a,b; Block 
and Brennan 1993). Others have considered the im- 
portance of coarse-scale variables, measured over 
landscapes much larger than a territory or home range 
(Shriner et al., Chapter 47; Boulinier et al. 1997; 
Lauga and Joachim 1992; Gustafson and Crow 1994; 
Pearson 1993). Habitat selection is believed to be a 
multiscale hierarchical phenomenon spanning selec- 
tion of a geographic range (first-order selection) to 
patterns of habitat utilization (third- or fourth-order 
selection) (Johnson 1980). Combining fine- and 
coarse-grained levels of habitat selection requires a 
multilevel approach. 

Geographic information systems (GIS) can be used 
to draw relationships between bird species and envi- 
ronmental variables and map distributions across 
large geographic areas. Palmeirim (1988) used a GIS 
to display the predicted distribution of songbirds by 
extracting cover-type information from Landsat satel- 
lite imagery (Thematic Mapper) and by incorporating 
rules on patch size requirements. Green et al. (1987) 
combined bird survey data with Landsat land-cover 
information to predict the abundance of hooded warblers
across a two-million-hectare area of the Yucatan
Peninsula. Information on slope, aspect, geology, and
land cover extracted from a GIS database was used to
map habitat areas in Texas for two endangered song- 
bird species, the golden-cheeked warbler (Dendroica 
chrysoparia), and the black-capped vireo (Vireo atri- 
capillus) (Shaw and Atkinson 1990). 

GISs and categorical models can be combined to 
quantify habitat use across a landscape. Pereira and 
Itami (1991) applied a logistic regression model in a 
GIS environment to predict the distribution of Mt. 
Graham red squirrels based on environmental and lo- 
cational variables. Mladenoff et al. (1995) used a sim- 
ilar approach to predict the presence or absence of 
wolf packs as a function of spatial indices and land- 
scape variables. 

Forest songbirds have been identified as high-prior- 
ity indicators for ecosystem management in the Bara- 
boo Hills of Wisconsin (Baraboo Hills Working 
Group 1994). The purpose of the study described in 
this chapter was to apply a multidisciplinary tool 
(GIS) in developing predictive habitat models that 
could be used for the conservation of forest songbirds 
across a 580-square-kilometer landscape. Our first ob- 
jective was to determine whether we could reliably 
predict the distribution of six forest songbird species 
across the landscape by combining a GIS and a categorical
modeling procedure (logistic regression). In doing
so, we wanted to identify important forest stand
and landscape-scale habitat components and to predict
the distribution of songbirds based on these
variables. Our second objective was to demonstrate
the usefulness of the resulting models for long-term
management of the forest landscape by projecting
how bird habitats would change under realistic future
scenarios.


Study Area 


The Baraboo Hills ecosystem, located in south-central 
Wisconsin, is an area renowned for its biological di- 
versity. This status was formalized in 1994 when it 
was declared a “bioreserve” by The Nature Conser- 
vancy. Twenty-seven natural communities, seventy- 
seven rare or sensitive plant and animal species, and 
135 bird species have been documented within this 
area (Baraboo Hills Working Group 1994). This ex- 
ceptional biodiversity can be partially explained by 
high topographic relief and the area's location at the 
junction of three ecoregions: the Moraines District, 
the Driftless District, and the Central Sand Plains Dis- 
trict (Curtis 1959). Habitats of this ecosystem range 
from dry ridgetops covered with scrubby oaks and 
glade species to cool moist valleys harboring species 
such as white pine and northern hemlock. 

The Baraboo Hills contains 22,260 hectares of for- 
est surrounded by a largely agricultural landscape. 
The shallow, rocky soils of the Baraboo Hills made 
much of this area unsuitable for farming. Nearly 
6,475 hectares of continuous forest remain in the Hills 
in contrast to other southern forests in the Midwest, 
which have undergone extensive fragmentation. How- 
ever, human-related pressures increase daily, and only 
11 percent of the area is currently protected from de- 
velopment. The forest community types we studied for 
this project are not considered rare on a national level 
but are highly significant in the context of statewide 
conservation efforts. In other words, the Baraboo 
Hills is the only remaining large tract of deciduous 
forest in south-central Wisconsin. 


Methods 


To develop our models, we used available GIS data 
layers and collected field information on the distribu- 
tion of birds. The following sources of data were 
available: (1) vegetation data from a forest-stand inventory,
(2) songbird data collected during point-count
censuses, and (3) coarser landscape measures
and spatial indices derived from GIS data (1:100,000
scale).


Forest-stand Inventory Data 


Data collected by The Nature Conservancy from 
1991 to 1993 provided information on stand-scale 
habitat variables that we believed would be impor- 
tant in determining the distribution of forest song- 
bird species (Robertsen 1995). Forest stands were de- 
fined as relatively homogeneous units of forest with 
similar plant species composition and structure. Our 
research took place in “southern forest” communities
(sensu Curtis 1959) in the Baraboo Hills because
of the extent and importance of these types on the 
landscape (Clark et al. 1993a). We chose six forest- 
stand variables to use in model development: domi- 
nant tree type, dominant tree size class, dominant 
tree stocking density, understory type, understory 
size class, and disturbance class (Table 34.1; Robertsen
1995).


Forest Songbird Data: Selection of Plots 


Point counts are an efficient censusing technique when 
the objective is to identify habitat use across a large 
geographic area (Bibby et al. 1992). A total of 550 
point-count census plots were established within 233 
southern forest stands to gather information on the 
distribution and habitat preference of songbirds 
(Robertsen 1995). In 1993, 468 plots were established 
to collect data necessary for the development of mod- 
els. In 1994, eighty-two new plots were censused for 
the purpose of validating the predictive model for one 
species, the Acadian flycatcher (Empidonax virescens) 
(Robertsen 1995). 

Census plots were visited once, and censuses were 
conducted within three hours of sunrise from June 1 
to July 2 in 1993 and from June 12 to June 19 in 
1994. This period covers the breeding season and the 
peak singing hours for male songbirds of most species 
in the Baraboo Hills (Mossman and Lange 1982). The
number of singing males of each species was recorded
within a 50-meter radius of the observer during a five- 
minute time period. 
TABLE 34.1.

Forest songbird species of concern and eleven forest-stand and landscape variables used in the development of predictive
songbird habitat models.

Forest songbirds
    Veery
    Ovenbird
    Chestnut-sided warbler
    Eastern wood-pewee
    Great crested flycatcher
    Acadian flycatcher

Forest-stand variables
    Dominant type: northern hardwoods, central hardwoods, oak
    Dominant tree size class (dbh): 5-11 in., 11-15 in., 15-24 in.
    Dominant tree stocking density (sq. ft/acre): 20-50, 51-85, 86+
    Understory type: central hardwoods, northern hardwoods
    Understory size class (dbh): 0-5 in., 5-11 in.
    Disturbance class: grazed, unknown

Landscape (coarse-scale) variables
    Distance to edge (meters): < 100, 100-200, > 200
    Distance to water (meters): < 100, 100-200, > 200
    Percent forest cover within 200 meters (%): < 80, 80-99, 100
    Slope (%): < 10, 10-25, > 25
    Aspect: north, east/west, south


Landscape-scale Data 


We expected that the following landscape-scale variables
might be important in a songbird habitat analysis:
distance to edge, percent forest cover, distance to water,
slope, and aspect (Table 34.1). These variables were
measured by using the spatial analytical capabilities of
ArcInfo GIS software. The first step in obtaining these
measures involved creating a point coverage that
mapped our bird census plots. A vegetation-community
coverage mapped the distribution of forest
and nonforest and allowed us to calculate percent forest
cover and distance-to-edge measures for each census
plot. ArcInfo GRID was used to obtain percent forest 
measures, which were grouped into three equal-sized 
categories: less than 80 percent cover, 80-99 percent 
cover, and 100 percent cover (Robertsen 1995). A 
stream data layer (USGS digital line graph data, 
1:100,000 scale, and USGS 7.5-minute quads, 1:24,000 
scale) was used as the source of information for meas- 
uring distance to water. Based on a digitized contour 
map of the area, a triangulated irregular network (TIN) 
was created to determine slope (less than 10, 10-25, 
greater than 25 percent) and aspect (Robertsen 1995). 
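
As a rough illustration of how such measures reduce to the classes in Table 34.1, the sketch below computes percent forest cover in a circular window on a small raster and bins continuous values into classes. It is not the ArcInfo workflow described above. The raster layout, the function names, and the handling of values that fall exactly on a class boundary are assumptions made for the example.

```python
def percent_forest(grid, row, col, radius_cells):
    """Percent of cells within a circular window that are forest (1) rather than nonforest (0)."""
    forest = total = 0
    for dr in range(-radius_cells, radius_cells + 1):
        for dc in range(-radius_cells, radius_cells + 1):
            if dr * dr + dc * dc > radius_cells * radius_cells:
                continue  # outside the circular window
            r, c = row + dr, col + dc
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                total += 1
                forest += grid[r][c]
    return 100.0 * forest / total


def classify_forest_cover(percent):
    """Three cover classes from Table 34.1."""
    if percent < 80:
        return "< 80"
    return "100" if percent >= 100 else "80-99"


def classify_distance(meters):
    """Distance-to-edge or distance-to-water class (boundary handling is assumed)."""
    if meters < 100:
        return "< 100"
    return "100-200" if meters <= 200 else "> 200"


def classify_slope(percent):
    """Slope class (boundary handling is assumed)."""
    if percent < 10:
        return "< 10"
    return "10-25" if percent <= 25 else "> 25"


# Hypothetical 5 x 5 raster with one nonforest cell next to the census plot
raster = [[1] * 5 for _ in range(5)]
raster[2][3] = 0
cover = percent_forest(raster, 2, 2, 1)
```

With one nonforest cell among the five cells in the window, `cover` is 80 percent, which falls in the 80-99 class.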


Model Development 


We began building our models by selecting songbird 
species (dependent variables) and habitat features (in- 
dependent variables). We chose songbird species if 
they were of special concern in the region (Thompson
et al. 1992) and if we had detected at least thirty individuals
during our censuses. By choosing this
cutoff, we excluded species that are rare or accidental
in the study area. We were most interested
in species such as the Acadian flycatcher that are
still abundant in the Hills but considered less abun- 
dant or rare on a statewide scale. One or two years of 
surveys may be insufficient for sampling specialized or 
rare species, and such species may require specific sur- 
vey methods (Karl et al., Chapter 51). 

We selected eleven habitat variables based on evi- 
dence of biological significance and independence 
from other variables using a Pearson correlation 
analysis (Table 34.1, Robertsen 1995). Variables that 
were considered but excluded from analysis based on 
a correlation with another variable of 0.5 or greater 
included community type, understory density, and per- 
cent forest cover within 100 and 300 meters of a plot. 
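
A correlation screen of this kind can be sketched as follows. The variable names and values are invented for illustration, and the use of the absolute value of r is an assumption; the text states only that variables correlated at 0.5 or greater were excluded.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def correlated_pairs(variables, threshold=0.5):
    """Return (name_a, name_b, r) for every pair with |r| at or above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(variables.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical plot-level measurements
candidates = {
    "forest_cover_200m": [1, 2, 3, 4, 5],
    "forest_cover_100m": [2, 4, 6, 8, 10],
    "slope": [3, 1, 4, 1, 5],
}
flagged = correlated_pairs(candidates)
```

For each flagged pair, one member would be dropped on biological grounds before model building.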

Songbird and habitat data were noncontinuous and 
required a categorical approach to modeling. We ran a 
univariate analysis on the 1993 census data to select 
habitat variables for inclusion in models (SAS Institute 
1985). For each species, we evaluated the significance 
of each of the eleven habitat variables using a proce- 
dure outlined by Hosmer and Lemeshow (1989) and 
excluded those with p greater than 0.25 from further 
consideration. 
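
One way to carry out such a univariate screen for a categorical habitat variable is a likelihood-ratio (G) test on the presence/absence by habitat-class table. The sketch below assumes that approach; the counts are hypothetical, and the p-value helper covers only 1 or 2 degrees of freedom, which is enough for variables with two or three classes.

```python
import math


def g_statistic(table):
    """Likelihood-ratio statistic for independence in a list-of-rows contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:
                expected = row_totals[i] * col_totals[j] / n
                g += 2.0 * observed * math.log(observed / expected)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return g, df


def p_value(g, df):
    """Upper-tail chi-square probability for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(g / 2.0))
    if df == 2:
        return math.exp(-g / 2.0)
    raise ValueError("this sketch handles only df = 1 or 2")


def keep_variable(table, cutoff=0.25):
    """Retain the variable unless p exceeds the screening cutoff."""
    g, df = g_statistic(table)
    return p_value(g, df) <= cutoff


# Rows: species present, species absent; columns: two habitat classes (hypothetical counts)
informative = [[10, 20], [20, 10]]
uninformative = [[15, 15], [15, 15]]
```

The liberal cutoff of 0.25 keeps variables that are weak on their own but may still contribute in a multivariate model.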

In another premodeling step, cross-tabulations
were run between all combinations of dependent and
independent variables to check for marginal zeroes
and to consider possible interactions between habitat
variables. Because zero-cell frequencies may lead to
erroneous results in categorical analyses due to the
development of a singular (noninvertible) covariance
matrix (SAS Institute 1985), we combined
categories, when possible, where zero cells appeared to be a
problem.
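
The zero-cell check and the merging of categories can be illustrated with a short sketch. The counts, class labels, and function names are hypothetical.

```python
def classes_with_zero_cells(table):
    """Return habitat classes whose (present, absent) counts include a zero."""
    return [name for name, counts in table.items() if 0 in counts]


def merge_classes(table, first, second, merged_name):
    """Combine two habitat classes into one by summing their counts."""
    merged = {k: v for k, v in table.items() if k not in (first, second)}
    merged[merged_name] = tuple(a + b for a, b in zip(table[first], table[second]))
    return merged


# Hypothetical cross-tabulation: habitat class -> (plots with species, plots without)
crosstab = {"< 100": (0, 40), "100-200": (3, 50), "> 200": (12, 60)}
problem_classes = classes_with_zero_cells(crosstab)
combined = merge_classes(crosstab, "< 100", "100-200", "<= 200")
```

After merging the two nearest-edge classes, no class in `combined` has an empty cell.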

For each species and its remaining set of habitat 
variables, we ran a multivariate analysis to derive a 
final equation. Models were developed using the pro- 
cedure CATMOD and followed a step-down method 
(SAS Institute 1985). We tested the importance of each 
variable removed using a likelihood ratio test (G sta- 
tistic). Final models included no more than two habi- 
tat variables due to general sparseness of the data (few 
bird observations/total number of counts) and for ease 
of application in a GIS environment. For these same 
reasons, models considered only main effects, or, in 
other words, no interactions between variables. 

The final model for each species included significant 
variables and showed a high probability of goodness 
of fit (Smith and Conners 1986; Hosmer and 
Lemeshow 1989). Model adequacy was determined by 
considering the residual (likelihood ratio) chi-square 
value and associated p value with a lack-of-fit test. 
The smaller the chi-square (and the higher the p) the 
better the model fits. Probabilities of a species being 
present or absent in each habitat class were generated 
(SAS Institute 1985). 
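
For reference, the likelihood-ratio statistic used in both steps can be written in its standard textbook form. The symbols are introduced here for illustration and are not the authors' notation: $O_i$ and $E_i$ are the observed and model-fitted counts in cell $i$, and $L$ denotes a maximized likelihood.

```latex
% Residual (lack-of-fit) statistic comparing observed and fitted cell counts
G = 2 \sum_{i} O_i \ln\!\left(\frac{O_i}{E_i}\right)

% Test for a variable removed in the step-down procedure (nested models)
G_{\text{removed}} = -2 \left[ \ln L_{\text{reduced}} - \ln L_{\text{full}} \right]
```

Both statistics are referred to a chi-square distribution, which is why a small residual value with a high p indicates that the reduced model fits nearly as well as the saturated one.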


Review and Validation of Models 


To assess each model, we considered whether the final 
variables for each songbird made sense based on what 
we knew of the biology of a species (Mossman and 
Lange 1982; Robertsen 1995). Evaluations based on 
biological intuitiveness are a critical component of 
model validation (Burnham and Anderson 1991; Van 
Horne 1991). Using insights gleaned from a literature 
review, we deduced how songbirds would respond to 
stand-level variables based on their response to plot- 
level variables (Robertsen 1995). 

We evaluated the Acadian flycatcher model using 
an independent data set collected in 1994. Testing a 
model with an independent data set is believed to be 
the best measure of accuracy (Fielding, Chapter 21;
Capen et al. 1986). Karl et al. (Chapter 51) found
that it is possible in some cases to develop and test 
models for more abundant species, with data from 
only one or two years. Censuses in 1994 were con- 
ducted at new plots within the six habitat types in- 
cluded in the 1993 Acadian flycatcher model. Be- 
cause of the large number of censuses required to 
statistically test the model, the greatest sampling ef- 
fort occurred at the extremes (i.e., sites with highest 
and lowest probability of occurrence). We compared 
observations for the Acadian flycatcher in each habi- 
tat class in 1994 to those predicted by the 1993 
model. 

Models for the other five species were not tested 
with independent data. However, confidence intervals 
on the predicted probability of occurrence in habitat 
of highest preference were calculated as an indication 
of model precision and are dependent on sample size 
(Table 34.2). Further application of these five models 
should be preceded by accuracy testing. 


Model Application 


One advantage of a GIS is its ability to create, display, 
and analyze the consequences of hypothetical changes 
to current landscape conditions. Using ArcInfo, we 
created scenarios for a 93-square-kilometer area of the 
Baraboo Hills and evaluated the implications of each 
scenario on songbird distributions. The scenarios were 
chosen to address likely land-use changes: (1) the com- 
bined effects of harvesting, succession, and fire man- 
agement in the southern dry mesic forest communities, 
and (2) fragmentation of all forest types due to hous- 
ing development. The assumptions required to create 
future conditions were based on expert opinion and 
existing literature (Clark et al. 1993a).

Our primary interest was in the future condition 
of relatively undisturbed mature stands of southern 
dry mesic forest and how changes in these stands 
would affect the songbird community. Southern dry 
mesic forests are dominated by a mixture of oaks 
(Quercus alba and Quercus borealis), basswood 
(Tilia americana) and sugar maple (Acer saccharum). 
These communities are considered unstable, with
dominance by one species likely to last one generation
(Clark et al. 1993a). Due to the economic importance
of oak, intense harvesting could rapidly
change the future composition and structure of
southern dry mesic forests in the Baraboo Hills.
Changes in the forest vegetation will affect habitat
availability for forest songbirds.


TABLE 34.2.

Results of categorical models for six species, including residual chi-square P for goodness-of-fit test, significant variables, habitat
of highest preference, and probability of occurrence.

Species                    Residual       Variables and p-value            Habitat of highest       Predicted probability   95% confidence
                           chi-square P                                    preference               of occurrence in        interval for the
                                                                                                    habitat of highest      predicted
                                                                                                    preference              probability
Acadian flycatcher         0.88           distance to edge (0.006)         > 200 m from edge,       0.45                    0.31-0.58
                                          understory type (0.000)          northern hardwoods
Veery                      0.12           distance to edge (0.046)         > 200 m from edge,       0.23                    0.13-0.37
                                          understory type (0.009)          northern hardwoods
Ovenbird                   0.38           dominant tree density (0.000)    stand basal area 86+,    0.64                    0.49-0.77
                                          understory type (0.045)          northern hardwoods
Chestnut-sided warbler     0.54           dominant tree size (0.002)       5-11 in. (dbh),          0.34                    0.17-0.56
                                          dominant tree density (0.000)    stand basal area 20-50
Eastern wood-pewee         n/a            dominant tree density (0.019)    stand basal area 86+     0.50                    0.40-0.60
Great crested flycatcher   n/a            dominant type (0.030)            central hardwoods        0.17                    0.08-0.30

To simulate a future condition, we changed compo- 
sition and structure of 50 percent of the existing ma- 
ture stands by making assumptions about timber har- 
vesting, succession, and fire. A thirty-year timber 
projection for the state of Wisconsin estimates a 36.6- 
percent reduction in the volume of oak in the state 
(Spencer et al. 1988). Assuming a similar loss of oak
in the Baraboo Hills (in other words, a 40 percent
loss), we projected that half the reduction (20 percent)
would occur due to harvesting and half (20 percent)
through succession. Over thirty years, we further
projected that 10 percent of the existing stands would 
retain or gain an oak component due to impacts of 
controlled burning. This projection was based on the 
current interest in using prescribed burns in the area. 
We assumed that dominant tree type, dominant tree 
size, dominant stocking density, and understory type 
changed as a result of harvesting, succession, and fire. 

Fragmentation as a result of housing development 
has become a growing threat to the large blocks of 
forest critically important to forest interior songbirds
of the Baraboo Hills. Plans are underway to improve
the main highway between the Baraboo Hills and
Madison, Wisconsin (a growing metropolitan area),
less than 50 kilometers away. Easier commuting and
the appeal of the scenic views from the Hills make 
them an attractive place to build new homes. Open- 
ings created by housing developments create hard 
edge types and lead to fragmentation of the forested 
landscape. We created two development scenarios, 
random and clustered, each with total area of new 
openings equal to 10 percent of the forest area. 
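
The contrast between the two development scenarios can be illustrated on a toy grid. This is a sketch under strong simplifying assumptions, not the ArcInfo procedure used in the study: the landscape is a hypothetical 40 by 40 grid of 100-meter forest cells, the clustered scenario is a single rectangular block, and only the new openings count as edge.

```python
import random

SIZE = 40             # hypothetical grid of forest cells, 100 m per cell
OPEN_FRACTION = 0.10  # new openings equal 10 percent of the forest area (from the text)
BUFFER_CELLS = 2      # cells within 200 m of an opening are not interior habitat


def random_openings(rng):
    """Scatter single-cell openings at random across the grid."""
    cells = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    return set(rng.sample(cells, round(SIZE * SIZE * OPEN_FRACTION)))


def clustered_openings(width=16, top=15, left=12):
    """Place the same number of openings in one contiguous block."""
    n = round(SIZE * SIZE * OPEN_FRACTION)
    return {(top + i // width, left + i % width) for i in range(n)}


def interior_count(openings):
    """Count forest cells more than BUFFER_CELLS away from every opening."""
    edge = set()
    for r, c in openings:
        for dr in range(-BUFFER_CELLS, BUFFER_CELLS + 1):
            for dc in range(-BUFFER_CELLS, BUFFER_CELLS + 1):
                if dr * dr + dc * dc <= BUFFER_CELLS ** 2:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < SIZE and 0 <= cc < SIZE:
                        edge.add((rr, cc))
    return SIZE * SIZE - len(edge)


scattered = random_openings(random.Random(1))
clustered = clustered_openings()
interior_random = interior_count(scattered)
interior_clustered = interior_count(clustered)
```

Both scenarios remove the same area, but scattered openings put far more of the remaining forest within 200 meters of an edge, so `interior_clustered` exceeds `interior_random`.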

We evaluated impacts on songbirds by calculating 
the area of “preferred occupied habitat” remaining 
under each scenario. “Preferred habitat” is defined 
here as the conditions with the highest probability of 
occurrence for a given species. We calculated the number
of preferred territories by multiplying the area of preferred
habitat by the probability of occurrence and assuming
territory size to be 1 hectare (Temple and Cary
1988).
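
The territory calculation is simple arithmetic, sketched below. The habitat areas are invented for illustration; only the 1-hectare territory size and the multiplication by probability of occurrence come from the text.

```python
def preferred_territories(area_ha, p_occurrence, territory_ha=1.0):
    """Area of preferred habitat times probability of occurrence, in territories."""
    return area_ha * p_occurrence / territory_ha


def percent_change(before, after):
    """Percent change in territories between current and future conditions."""
    return 100.0 * (after - before) / before


# Hypothetical example: 1,000 ha of preferred habitat shrinking to 730 ha,
# for a species with a 0.45 probability of occurrence in that habitat
current = preferred_territories(1000, 0.45)
future = preferred_territories(730, 0.45)
change = percent_change(current, future)
```

Because the probability of occurrence is held constant, the percent change in territories equals the percent change in preferred-habitat area.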


Results 


Based on regional concern and abundance on our cen- 
sus plots, six forest songbirds were selected for the 
habitat-model-building process. The Acadian flycatcher
(Empidonax virescens), veery (Catharus
fuscescens), chestnut-sided warbler (Dendroica pensyl- 
vanica), eastern wood-pewee (Contopus virens), and 
great crested flycatcher (Myiarchus crinitus) are of 
high regional conservation concern (Thompson et al. 
1992) and were relatively abundant (more than thirty
individuals detected during our censuses). The oven- 
bird (Seiurus aurocapillus) is of lower regional 
concern, but it was chosen for our project because it 
was the most frequently recorded species. 

Final models for each of the six species contained 
one to two variables (Table 34.2). Stand-scale vari- 
ables were significant in all developed models. Land- 
scape-scale (coarse-scale) variables were significant in 
two of the six models: the Acadian flycatcher and the 
veery. The equation for each of the final models is of 
the following form: 


$$\ln\left(\frac{P}{1 - P}\right) = \text{baseline value} + \text{VAR1 effect} + \text{VAR2 effect} + \cdots$$


where 

P = Probability that the site is occupied. 

Highest probability of occurrence ranged from 0.17 
to 0.64 depending on the species. Residual chi-square 
P ranged from 0.12 for the veery to 0.88 for the Aca- 
dian flycatcher (Table 34.2). Residual chi-square P 
could not be calculated for the two models with only 
one habitat variable. 
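
To show how an equation of this form turns into probabilities and odds ratios, the sketch below uses the 1993 Acadian flycatcher coefficients reported in Table 34.3. The baseline value is not reported in this chapter, so it is back-solved here from the highest predicted probability (0.45) purely for illustration. Because the published coefficients are rounded, the results are approximate.

```python
import math

# 1993 effect-coded coefficients for the Acadian flycatcher (Table 34.3)
UNDERSTORY = {"central hardwoods": -1.01, "northern hardwoods": 1.01}
EDGE = {"< 100 m": -0.75, "100-200 m": 0.06, "> 200 m": 0.69}


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Probability corresponding to a log-odds value."""
    return 1.0 / (1.0 + math.exp(-x))


def odds_ratio(effect_a, effect_b):
    """Odds of occurrence at level a relative to level b."""
    return math.exp(effect_a - effect_b)


# Assumption: baseline back-solved from the reported maximum probability of 0.45
BASELINE = logit(0.45) - UNDERSTORY["northern hardwoods"] - EDGE["> 200 m"]


def p_occupied(understory, edge):
    """Predicted probability that a site in the given habitat class is occupied."""
    return inv_logit(BASELINE + UNDERSTORY[understory] + EDGE[edge])


understory_or = odds_ratio(UNDERSTORY["northern hardwoods"], UNDERSTORY["central hardwoods"])
edge_or = odds_ratio(EDGE["> 200 m"], EDGE["< 100 m"])
lowest = p_occupied("central hardwoods", "< 100 m")
```

The two odds ratios round to 7.5 and 4.2, and the lowest predicted probability comes out between 0.02 and 0.03.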

Table 34.3 summarizes an example of a final model, that for the
Acadian flycatcher. A difference in coefficients
between northern and central hardwoods of
2.02 indicates that northern hardwood understory
types have increased odds of Acadian flycatcher occurrence
of exp(2.02), or 7.5 times over central hardwoods.
Similarly, the odds of finding an Acadian flycatcher
in an area far from edge (more than 200
meters) are exp(1.44), or 4.2 times greater than the
odds of finding this species close to edge (less than 100
meters) (Table 34.3; Robertsen 1995). The predicted
probabilities of occurrence (P) for the final Acadian
flycatcher model ranged from a low of 0.02 (central
hardwood understory, close to edge) to a high of 0.45
(northern hardwood understory, far from edge).

We found that the probability of occurrence for
Acadian flycatchers in each habitat class in 1994
closely followed what was predicted based on the
1993 model (Table 34.4). A chi-squared analysis indicated
that differences between the two years were
nonsignificant but that the test lacked the power required
to make a definite conclusion due to small sample
size in 1994 (χ² = 8.78, df = 7, p > 0.25).


TABLE 34.3.

Logistic regression coefficients (βj) for the effect of
understory type and distance to edge on the occurrence of
Acadian flycatchers (Empidonax virescens).

Variable and variable level     1993 (n = 468)    1994 (n = 82)
Understory type
    Central hardwoods               -1.01             -0.65
    Northern hardwoods              +1.01             +0.65
Distance to edge
    Less than 100 meters            -0.75             -7.51
    100-200 meters                  +0.06             +3.81
    More than 200 meters            +0.69             +3.70


TABLE 34.4.

Observed number of 1994 plots with Acadian flycatchers (Empidonax virescens) present and absent for six habitat classes
compared to predicted numbers based on the 1993 model.

Understory type and distance to edge         Predicted present   Observed present   Predicted absent   Observed absent
Central hardwoods, < 100 m from edge         0.4
Central hardwoods, 100-200 m from edge       0.2
Central hardwoods, > 200 m from edge         [illegible]
Northern hardwoods, < 100 m from edge        32
Northern hardwoods, 100-200 m from edge      1.6
Northern hardwoods, > 200 m from edge        6.8


Figure 34.1. Predicted change in number of territories within preferred habitat for six species (Acadian flycatcher [Empidonax
virescens], veery [Catharus fuscescens], ovenbird [Seiurus aurocapillus], chestnut-sided warbler [Dendroica pensylvanica], eastern
wood-pewee [Contopus virens], great crested flycatcher [Myiarchus crinitus]) after thirty years of timber harvest, fire, and succession
in the Baraboo Hills of Wisconsin.
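
The reported bound on p can be checked from the test statistic and degrees of freedom alone. The sketch below evaluates the upper tail of the chi-square distribution with standard-library functions only; the helper name is an assumption, and the closed-form series it uses applies to integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^i / i!
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum of (x/2)^(i - 1/2) / Gamma(i + 1/2)
    total = 0.0
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (i + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


p_1994 = chi2_sf(8.78, 7)
print(f"p = {p_1994:.3f}")
```

The value lies a little above 0.25, consistent with the nonsignificant result reported for the 1994 validation data.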


Model Application 


Under our harvesting/succession/fire scenario,
four of the six species showed a decrease in abundance
under projected future conditions: Acadian flycatcher
(-5 percent), veery (-5 percent), ovenbird (-19 percent),
and eastern wood-pewee (-11 percent) (Fig.
34.1, Table 34.5). Two of the six increased in abundance:
chestnut-sided warbler (+222 percent) and
great crested flycatcher (+8 percent). The greatest benefit
of this future scenario was realized by the chestnut-sided
warbler, a species that currently has much
less preferred habitat across the landscape than the
other five. The greatest negative effect (a 19 percent
loss) occurred for the ovenbird, the species currently
having the largest amount of “preferred” habitat.

Five of the six songbird species experienced losses
in total number of preferred territories (average 13
percent loss) under the random development scenario
(Table 34.6). The losses resulting from a clustered development
scenario ranged from 4 to 14 percent (average
8 percent loss) (Table 34.7). The loss of preferred
territories more than 200 meters from an edge
was much higher under a random development scenario
(average 25 percent loss) than under a clustered
development scenario (average 11 percent loss) (Tables
34.6 and 34.7). The Acadian flycatcher and the
veery showed a habitat preference for interior forest
and experienced the greatest loss in territories (Tables
34.6 and 34.7). The ovenbird and the eastern wood-pewee
did not show a large percentage loss in total
number of preferred territories but did show a large
decrease in the number of preferred territories more
than 200 meters from an edge.


TABLE 34.5.

Predicted percent change in preferred territories for six forest
songbird species after thirty years of timber harvest, fire, and
succession in the Baraboo Hills of Wisconsin.

Common name/Scientific name                         Predicted percent change in
                                                    preferred territories (%)
Acadian flycatcher/Empidonax virescens              -5
Veery/Catharus fuscescens                           -5
Ovenbird/Seiurus aurocapillus                       -19
Chestnut-sided warbler/Dendroica pensylvanica       +222
Eastern wood-pewee/Contopus virens                  -11
Great crested flycatcher/Myiarchus crinitus         +8


TABLE 34.6.

Percent reduction in preferred territories for six forest songbirds following a random housing development scenario.

Common name/Scientific name                         Percent change in           Percent change in preferred
                                                    preferred territories (%)   territories more than 200 meters
                                                                                from an edge (%)
Acadian flycatcher/Empidonax virescens              -27                         -27
Veery/Catharus fuscescens                           -27                         -27
Ovenbird/Seiurus aurocapillus                       -5                          -30
Chestnut-sided warbler/Dendroica pensylvanica       0                           0
Eastern wood-pewee/Contopus virens                  -7                          -31
Great crested flycatcher/Myiarchus crinitus         -10                         -33


TABLE 34.7.

Percent reduction in preferred territories for six forest songbirds following a clustered housing development scenario.

Common name/Scientific name                         Percent change in           Percent change in preferred
                                                    preferred territories (%)   territories more than 200 meters
                                                                                from an edge (%)
Acadian flycatcher/Empidonax virescens              -14                         -14
Veery/Catharus fuscescens                           -14                         -14
Ovenbird/Seiurus aurocapillus                       -4                          -13
Chestnut-sided warbler/Dendroica pensylvanica       -4                          0
Eastern wood-pewee/Contopus virens                  -5                          -12
Great crested flycatcher/Myiarchus crinitus         -8                          -13
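
The average losses quoted above can be recomputed directly from the values in Tables 34.6 and 34.7, as in this short sketch. The dictionary layout and species keys are choices made for the example.

```python
# (all preferred territories, territories > 200 m from an edge), percent change
RANDOM_DEVELOPMENT = {      # Table 34.6
    "Acadian flycatcher": (-27, -27),
    "Veery": (-27, -27),
    "Ovenbird": (-5, -30),
    "Chestnut-sided warbler": (0, 0),
    "Eastern wood-pewee": (-7, -31),
    "Great crested flycatcher": (-10, -33),
}
CLUSTERED_DEVELOPMENT = {   # Table 34.7
    "Acadian flycatcher": (-14, -14),
    "Veery": (-14, -14),
    "Ovenbird": (-4, -13),
    "Chestnut-sided warbler": (-4, 0),
    "Eastern wood-pewee": (-5, -12),
    "Great crested flycatcher": (-8, -13),
}


def average_loss(table, column):
    """Mean percent loss across species, as a positive whole number."""
    values = [row[column] for row in table.values()]
    return round(-sum(values) / len(values))


random_total = average_loss(RANDOM_DEVELOPMENT, 0)
random_interior = average_loss(RANDOM_DEVELOPMENT, 1)
clustered_total = average_loss(CLUSTERED_DEVELOPMENT, 0)
clustered_interior = average_loss(CLUSTERED_DEVELOPMENT, 1)
```

The four averages reproduce the 13, 25, 8, and 11 percent losses cited in the text.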


Discussion 


Several studies have indicated the importance of land- 
scape-scale features in determining the distribution of 
birds (Lauga and Joachim 1992; Robbins et al. 1989a;
Lynch and Whigham 1984). Many of the landscape- 
scale (coarse grain) measures we expected would be 
important were not statistically significant in any of 
the models. These included slope, aspect, distance to 
water, and percent forest cover. However, significance 
tests on data may mask “preference” of resources due 
to the subjectiveness of the researcher in defining 
available habitat (Johnson 1980). Although they are
similar measures, distance to edge was a better predictor
than percent forest cover within 200 meters and
was a significant component in the Acadian flycatcher
and veery models.

On the other hand, forest-stand variables were im- 
portant predictors of songbird occurrence in all of our 
models (Table 34.2). Our forest-stand variables may 
have contained more useful information on habitat 
than can be measured by landscape-scale variables. 
Shriner et al. (Chapter 47) found topographic vari- 
ables to be more important than habitat variables in 
predicting songbird distribution but for a largely 
undisturbed landscape. Ongoing timber harvesting 
and other developments in the Baraboo Hills may re- 
duce the effectiveness of topographic variables in pre- 
dicting vegetation types. Understory type and domi- 
nant tree-stocking density factored into more models 
than other forest-stand variables. Understory type was 
better at differentiating bird habitats than either dominant
tree species or distance to water. Dominant tree
size was a variable of significance for the chestnut-sided
warbler, a species that utilizes early successional
stands. This variable was also highly significant
for the ovenbird, which prefers closed-canopy, mature
forests. Our results support the notion that it is important
to consider macro- as well as micro-scale
habitat features (Block and Brennan 1993). However,
researchers should not neglect the importance of
stand-level features in favor of coarser landscape-scale
measures, especially in disturbed areas.

Figure 34.2. Preferred habitat for the Acadian flycatcher (Empidonax virescens) under a random development scenario.

The ability to develop predictive models for species 
that have a strong association with stand or plot-scale 
variables may be limited by the cost of collecting such 
data and by the resolution of GIS layers. Collecting 
forest stand inventory data for the Baraboo Hills was 
an extensive effort and required skill and experience 
on the part of the field crew. Observer bias may have
affected the accuracy of this data layer, whereas
coarser-grain measures such as elevation could be
obtained more objectively. In addition, forest stand inventory
data are dynamic by nature, and corresponding
GIS layers must be updated on a regular basis. Because
of the time and resources required to obtain and
maintain landscape-level resource data, it is critical to 
evaluate the ability of a shared database to meet mul- 
tiple objectives (Donovan et al. 1987). We were un- 
able to develop a model for the wood thrush (Hyloci- 
chla mustelina), possibly because this species prefers 
gaps within a closed canopy, which was not a forest- 
stand measure available in our database. If different 
songbird habitat measures had been collected during 
the forest-stand inventory, they might have improved 
the predictability of models or permitted development 
of models for additional species. 
Figure 34.3. Preferred habitat for the Acadian flycatcher (Empidonax virescens) under a clustered development scenario.

Modeling Approach and Accuracy


We applied an empirical modeling approach based on 
the results of 550 songbird censuses. Habitat capability
models (a theoretical approach) produce an index
to habitat suitability ranging from zero (unsuitable) to 
one (highly suitable). It is difficult to know whether 
these indices reflect environmental conditions or pop- 
ulation response (Morrison et al. 1992). Our ap- 
proach directly measures and quantifies population re- 
sponse (probability of occurrence) within a given 
habitat type. 

The final models met our expectations with regard 
to habitat selection by each species, but the goodness- 
of-fit test varied from poor for the veery to good for 
the Acadian flycatcher (Table 34.2). As might be ex- 
pected, the two species with the highest goodness of fit 
are both habitat specialists, the Acadian flycatcher and 
the chestnut-sided warbler. The Acadian flycatcher's 
preference for closed-canopy, southern-mesic forest 
stands (Mossman and Lange 1982) is revealed by the 
importance of the northern understory variable. The 
chestnut-sided warbler's association with early succes- 
sional habitats is revealed by the importance of tree 
size and density in that model. On the other hand, the 
veery is considered a habitat generalist with regard to 
forest structure, which may explain the relatively poor 
fit of that model. Dettmers and Bart (1999) found this 
same pattern in their development of GIS models for 
songbirds. 

For all six species modeled, probability of occur- 
rence in the most preferred habitat was relatively low 
(Table 34.2). Improved measures of habitat may in- 
crease these probabilities. Karl et al. (Chapter 51) 
found that models developed for heterogeneous land- 
scapes (like the Baraboo Hills) required higher-resolu- 
tion habitat data. More likely, this pattern is a result 
of factors other than habitat playing a role in the dis- 
tribution of a species, such as competition, predation, 
stochastic events, weather, and regional population 
status. Others have found that predictive habitat- 
based models often only account for half the variation 
observed in species abundance or density (Morrison et 
al. 1992). The advantage to our approach is that with
our 550 censuses we directly measured this variability
rather than assuming that all “preferred habitat” was
occupied.


One might expect the best models to be developed 
with species that are habitat specialists yet are abun- 
dant enough to provide adequate sample size. Our 
ability to predict the occurrence of a species using this 
method did not appear to be closely tied to abun- 
dance, with the exception of very rare species. The 
Acadian flycatcher and chestnut-sided warbler were 
detected on less than 12 percent of our census plots 
yet produced the best models. The ovenbird was de- 
tected on 38 percent of our plots, but the final model 
has a lower p value (Table 34.2). This lack of a close 
relationship between abundance and our ability to 
model preference may be tied to the scale of our study 
area. The accuracy of model predictions generally im- 
proves at very coarse scales, in other words, the state 
of Wisconsin versus the Baraboo Hills (Boone and 
Krohn 1999). In addition, all six species for which we
developed models could be classified as “abundant”
compared to those we chose to exclude (those
species present on fewer than thirty stops). Five of the
six models require additional tests to determine their 
true accuracy since accuracy is largely independent of 
goodness of fit (Fielding, Chapter 21). 

As would be expected (Karl et al., Chapter 51), we 
were unable to develop models for several species of 
high concern that occurred in very low densities, such 
as the mourning warbler (Oporornis philadelphia) 
and hooded warbler (Wilsonia citrina), due to a high 
number of zero occurrences in some habitat types. In 
addition, our methodology would not work well for 
species with a low level of detectability since it 
is based on identifying bird presence in preferred 
habitats. 

Predictive models should not be used to forecast 
into the future until they have been validated against 
an independent data set (Morrison et al. 1992). The 
Acadian flycatcher model is the only model that we 
tested with an independent data set. Results of this 
test indicate that the 1994 observations were not sig- 
nificantly different from the 1993 model predictions, 
but the test lacked the power to make a definite con- 
clusion (Robertsen 1995). Better models could possi- 
bly be derived with increased sample size (Dettmers 
and Bart 1999). The performance of these models is
unlikely to be as high as the performance of models
using coarser-grain habitat features over larger areas
(Karl et al., Chapter 51; Fielding, Chapter 21). However,
these models are still useful for land managers in
the Baraboo Hills because they identify potential habitats for
protection. Models are unlikely to perform at 100 per- 
cent, and their usefulness should be judged based on 
the desired management application and scale (Van 
Horne, Chapter 4; Fielding, Chapter 21). 


Management Applications 


The fire/harvesting/succession and housing develop- 
ment scenarios demonstrate how these models can be 
applied by conservationists. If harvesting and succes- 
sional trends continue, the chestnut-sided warbler will 
benefit the most, whereas the ovenbird will be the 
most negatively affected (Fig. 34.1). Most timber har- 
vesting in the Baraboo Hills takes place in central 
hardwoods stands; however, those stands with an un- 
derstory of northern hardwoods have the highest 
habitat value for the ovenbird, Acadian flycatcher, and 
veery. Impacts to habitat can be reduced in such cases 
by directing timber harvesting to stands with a central 
hardwoods understory or applying selection harvest- 
ing methods. Selection versus clearcut logging would 
also benefit the eastern wood-pewee by retaining more 
canopy cover. The great crested flycatcher showed a 
small increase in number of territories correlated with 
the succession of oak stands. The chestnut-sided war- 
bler may benefit from created openings, but nesting 
success in these areas needs to be examined. Cluster- 
ing housing development helped reduce loss in territo- 
ries for most species, especially for the Acadian fly- 
catcher and veery, which were significantly more likely 
to occur far from an edge (Figs. 34.2, 34.3). 


Conclusion 


We looked at the current and future distribution of six 
forest songbirds of conservation concern across a re- 
gionally important landscape. All models included for- 
est-stand variables, thus emphasizing the importance 
of habitat measures at this scale. Only the Acadian
flycatcher and the veery models contained a landscape-
scale (coarse-scale) measure, distance to edge, as a sig- 
nificant predictor. We applied these models to look at 
the future impacts of timber harvesting, forest succes- 
sion, prescribed burning, and housing developments. 
Under our timber harvest/succession/prescribed-burn 
scenario, preferred habitat increased for two of the six 
species, the chestnut-sided warbler and the great 
crested flycatcher but decreased for the Acadian fly- 
catcher, veery, ovenbird, and eastern wood-pewee. 
Under a development scenario, random housing devel- 
opment resulted in a 13 percent average decrease in 
abundance as compared to an 8 percent average de- 
crease for clustered housing development. These aver- 
age losses were greater for forest interior habitats: 25 
percent for random development and 11 percent for 
clustered development. Based on a test of the Acadian 
flycatcher model in 1994, the models we developed 
using 1993 data looked promising. 
Poisson Regression: A Better Approach to
Modeling Abundance Data?


Malcolm T. Jones, Gerald J. Niemi, JoAnn M. Hanowski,
and Ronald R. Regal


In many ecological studies, researchers undertake
what may be an elusive search for relationships be- 
tween the occurrence and/or abundance of a particu- 
lar species and its environment. The majority of these 
studies each use some form of regression technique in 
an attempt to develop a statistically valid model that 
subsequently can be used for predicting or elucidating 
an ecological process. In reviewing the last five years 
(1995-1999) of Ecology, Ecological Applications, 
Ecological Monographs, and Conservation Biology, 
we found that 43 percent of the articles discussing the 
development of models for predicting the occurrence 
or abundance of a species used multiple linear regres- 
sion, 27 percent used logistic regression, 34 percent 
used other parametric techniques (e.g., principal com- 
ponent analysis), and 2 percent used classification and 
regression trees. Thus, we agree with Morrison et al. 
(1992) and Trexler and Travis (1993) that one of the 
most commonly used statistical techniques to develop 
predictive models relating habitat to abundance has 
been multiple linear regression. We suspect that this 
phenomenon is due to (1) the historical use and ac- 
ceptance of this technique in the scientific literature, 
(2) the fact that more ecologists are exposed to this 
technique in statistics courses than to the more mod- 
ern regression techniques, and (3) the ease of inter- 
preting the results. It is interesting to note that in our 
literature review, we found no articles describing the
use of Poisson regression in developing predictive
models of species abundance. 

With the continually increasing power and sophisti- 
cation of personal computers and statistical software, 
ecologists have greater access to more modern regres- 
sion techniques than ever before. The most notable 
modern regression techniques are those based on gen- 
eralized linear models (GLMs), including logistic 
(Hosmer and Lemeshow 1989; Agresti 1990; Menard 
1995; Long 1997), Poisson (Hastie and Pregibon 
1997; Long 1997; Venables and Ripley 1997), and 
negative binomial (Long 1997; Venables and Ripley 
1997). All three of these regression models are para- 
metric because they are predicated on the assumption 
that the data conform to a particular frequency distri- 
bution. The most commonly used of these techniques 
is logistic regression, with numerous examples in the 
ecological literature (Osborne and Tigar 1992; Trexler 
and Travis 1993; Smith 1994). The use of Poisson re- 
gression in ecological studies is uncommon (Welsh et 
al. 1996; Vernier et al., Chapter 50). Another statisti- 
cal approach is to use an adaptive or nonparametric 
technique (i.e., no a priori assumption about the un- 
derlying distribution) such as classification and regres- 
sion trees (Breiman et al. 1984) or neural networks 
(Ripley 1996). Recently, several papers have discussed
classification and regression tree models (Michaelsen
et al. 1987; Walker 1990; Moore et al. 1991; Walker
and Cocks 1991; Baker 1993; Michaelsen et al. 1994;
Hernandez et al. 1997; O'Connor et al. 1996; O'Connor
and Jones 1997; Fertig and Reiners, Chapter 42;
Thomas et al., Chapter 10).

Figure 35.1. Frequency distribution of the sampled number of individuals of Blackburnian warbler (Dendroica fusca) from a long-term population monitoring study in northeastern Minnesota and northwestern Wisconsin. For details of this study, see Hawrot et al. (1998).

The selection of any technique must be based on 
the type of data to be analyzed. First, the researcher 
must decide whether the data are measured on a ratio, 
interval, ordinal, or nominal scale, and whether these 
data are discrete or continuous (Zar 1984). Second, 
how well do these data conform to the underlying as- 
sumptions of the particular statistical procedure (e.g., 
normal distribution, linear relationships, homoscedas- 
ticity)? It is not uncommon for ecological data, such 
as abundance data, to show a highly skewed fre- 
quency distribution (Fig. 35.1). Although testing of 
the underlying assumptions of statistical tests is rela- 
tively common, it is likely that less emphasis has been 
placed on matching the technique to the type of data 
that has been collected. 

A large proportion of floral and faunal surveys use
count data or density estimates, standardized to a
given areal unit (e.g., number of individuals per 10 
hectares). Thus, it is not uncommon for ecologists to 
work with data that are discrete and were measured 
on a ratio scale (i.e., count data). By making this 
rather simple determination, we can use techniques 
that have been developed for analysis of count data, 
such as Poisson regression. A least-squares linear re- 
gression model could give us inefficient and biased pa- 
rameter estimates for this type of data (Long 1997). 
Although Preston (1948, 1962a,b) introduced the concept
of the veiled normal distribution to explain highly
skewed abundance data, use of a linear regression
model with count data results in the possibility of predicting
a negative abundance estimate. Biologically,
such an outcome is unfeasible and only complicates 
the interpretation of the regression model. Logistic re- 
gression, on the other hand, requires one to discard 
meaningful abundance information. 
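
The possibility of a negative predicted abundance can be seen in a small sketch. The counts and the habitat gradient below are hypothetical values invented for illustration, not data from this chapter, and the function name `ols_fit` is our own.

```python
# Hypothetical data: counts of a species along a made-up habitat gradient x.
x = [0, 1, 2, 3, 4, 5, 6, 7]
counts = [9, 6, 4, 2, 1, 0, 0, 0]


def ols_fit(xs, ys):
    """Ordinary least-squares intercept and slope for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = ols_fit(x, counts)

# Prediction at the upper end of the sampled gradient.
linear_pred_at_7 = intercept + slope * 7
print(linear_pred_at_7)  # falls below zero for these counts
```

A model fit with a log link, as in Poisson regression, predicts exp(a + b*x), which is positive for any values of the coefficients.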
Poisson Regression


The Poisson regression model (PRM) has several features
that make its use appealing for ecological applications.
PRM assumes an underlying Poisson distribution,
which is defined in Equation 35.1, where P(X) is
the probability of X occurrences and X is the count of
events.


$$P(X) = \frac{e^{-\mu}\,\mu^{X}}{X!} \qquad (35.1)$$


Although it is common to see μ referred to as the
rate at which X events occur, it also can be expressed
as the expected count (i.e., population mean count)
(Long 1997). As μ increases, the Poisson distribution
approximates a normal distribution (Fig. 35.2). This 
allows one to fit a wide range of apparent distribu- 
tions with a unified modeling approach. Additionally, 
PRM is constrained by its underlying distribution to 
be bounded by zero at its lower end. Lastly, we can
obtain predictions from PRM either on the scale of the 
response variable (i.e., an actual count) or as a proba- 
bility of occurrence. 
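
The two scales of prediction can be sketched as follows, assuming only the standard Poisson probability function; the function names and the example values of the expected count are hypothetical. The probability of occurrence is taken here as one minus the probability of a zero count.

```python
import math


def poisson_pmf(x, mu):
    """Probability of exactly x events when the expected count is mu."""
    return math.exp(-mu) * mu ** x / math.factorial(x)


def prob_occurrence(mu):
    """Probability of at least one individual: 1 - P(0) = 1 - exp(-mu)."""
    return 1.0 - math.exp(-mu)


# Hypothetical expected counts spanning rare to common.
for mu in (0.5, 2.0, 10.0):
    print(mu, round(poisson_pmf(0, mu), 4), round(prob_occurrence(mu), 4))
```

For small expected counts most of the probability sits at zero, which gives the right-skewed shape typical of abundance data.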

Most modern statistical software packages (e.g., SAS, S-Plus,
and Systat) have routines for fitting GLMs, and
thus PRM. GLMs are fit using maximum likelihood 
estimation instead of ordinary least squares (Hastie 
and Pregibon 1997; Long 1997). The major assump- 
tions of PRM are that (1) the data follow a Poisson 
distribution, and (2) the events are independent. Lack 
of fit between the data and the Poisson distribution is
often attributed to overdispersion (variance exceeding
the mean), or to the rate of the count variable
varying between individuals (i.e., heterogeneity) (Long
1997). These potential problems have led to the development
of many extensions of PRM, such as zero-inflated
Poisson regression (Lambert 1992; Welsh et al.
1996; Long 1997), truncated Poisson regression (Long
1997), and even negative binomial regression
(Long 1997; Venables and Ripley 1997). For a very
readable discussion of PRM and the other methods
mentioned above, see Long (1997).

Figure 35.2. Examples of the Poisson distribution as λ (equivalent to μ in Equation 35.1) increases. Low values of λ, upper left histogram, result in right-skewed frequency distributions. As λ increases, the distributions begin to approximate the normal distribution.
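
Because a Poisson variable has equal mean and variance, a quick informal check for overdispersion in raw counts is the ratio of sample variance to sample mean. The sketch below is a rough screening aid, not a formal test, and the counts are hypothetical.

```python
def dispersion_ratio(counts):
    """Sample variance divided by sample mean; near 1 for Poisson-like counts."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


# Hypothetical counts with many zeros and a few large values.
clumped = [0, 0, 0, 0, 0, 0, 1, 0, 7, 12]
print(dispersion_ratio(clumped))  # well above 1, suggesting overdispersion
```

A ratio well above one in the raw counts may also reflect the effect of covariates, so it is best read alongside the residual deviance of a fitted model.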

The main purpose of this chapter is to increase 
awareness about regression techniques more appropri- 
ate than linear regression for ecological analyses, 
without necessarily reducing relative abundance infor- 
mation to presence/absence categories as required by 
logistic regression. Secondarily, we compare the per- 
formance of Poisson regression with that of logistic re- 
gression in a simulation exercise. 


Methods 


Wildlife surveys typically provide relative abundance 
data assumed to represent true densities of the organ- 
ism being studied but that in fact do not necessarily do 
so. The accuracy of surveys depends on the detectabil- 
ity of individuals, which is influenced by the life- 
history traits of the organism as well as the habitat 
being sampled. Using real-world data to compare the 
performance of these two regression techniques would, 
under the best scenario, give us results confounded 
with the sampling design and, under the worst sce- 
nario, could give us false or misleading results. There- 
fore, we decided to use a Monte Carlo simulation to 
generate data for a single species so that we would be 
certain of the “truth” (i.e., the actual number of birds 
in a given sampling unit) and accurately compare the 
two regression procedures. We chose to generate our 
virtual birds in a real landscape that is a predominantly 
forested 1.1-million-hectare region located in St. Louis 
and Lake Counties in northeastern Minnesota (see Fig. 
35.3 in color section). The land cover was classified 
using a Thematic Mapper (TM) satellite image with a 
ground resolution of 30 by 30 meters. The satellite im- 
agery and resultant classification were obtained as part 
of an ongoing, long-term study designed to assess the 
distribution and abundance of breeding birds in 
forested regions of Minnesota (Hanowski and Niemi 
1994, 1995). The original forty-two land-cover classes 
were reduced to six classes for this analysis to simplify 
the bird/habitat relationships we would be simulating.
We used ARC/INFO GIS software (version 7.2, ESRI)
to identify individual patches of each cover type. 

The birds in our simulation use conifer patches 
larger than 0.16 hectares (40x40 meters), which ex- 
cluded all conifer patches too small to support one in- 
dividual. For each iteration of the simulation, infor- 
mation on the 71,695 conifer patches was used as 
input for allocating a random number of birds to each 
patch. The number of birds randomly assigned to each 
patch was determined in the following manner. 

Two variables, TerrSize (size of an individual terri- 
tory) and Occ_Rate (number of birds per territory) 
were provided for each run of the simulation. TerrSize 
was allowed to vary from 0.5 hectare, representing 
high population density, to 10 hectares, representing 
low population density. To reduce the complexity of 
the comparisons, Occ_Rate was held constant at 0.6 
for all simulations. A measure of density per hectare,
Bird_Den, was calculated by dividing the occupancy rate,
Occ_Rate, by territory size, TerrSize (Eq. 35.2).


$$\mathit{Bird\_Den} = \frac{\mathit{Occ\_Rate}}{\mathit{TerrSize}} \qquad (35.2)$$

For each conifer patch we calculated the expected
number of birds, Expbirds, based on the area of each
patch, Patch_Size (Eq. 35.3). This was used to generate
a random Poisson deviate using a random number
generator from Press et al. (1992). The resulting value
was the true number of birds, Nbirds, in a given
patch.


$$\mathit{Expbirds} = \mathit{Bird\_Den} \times \mathit{Patch\_Size} \qquad (35.3)$$
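
The allocation of birds to patches in Equations 35.2 and 35.3 can be sketched as below. This is an illustration under stated assumptions, not the original simulation code: the Poisson deviates come from Knuth's multiplication method rather than the Press et al. (1992) routine, the patch sizes are hypothetical, and the function names are our own.

```python
import math
import random


def poisson_deviate(mu, rng):
    """Knuth's multiplication method; suitable only for small expected counts."""
    limit = math.exp(-mu)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def populate_patches(patch_sizes_ha, terr_size_ha, occ_rate=0.6, seed=0):
    """Assign a random number of birds (Nbirds) to each conifer patch."""
    rng = random.Random(seed)
    bird_den = occ_rate / terr_size_ha      # Eq. 35.2: birds per hectare
    nbirds = []
    for patch_size in patch_sizes_ha:
        expbirds = bird_den * patch_size    # Eq. 35.3: expected birds in patch
        nbirds.append(poisson_deviate(expbirds, rng))
    return nbirds


# Hypothetical patch sizes (hectares) at a territory size of 2 hectares.
print(populate_patches([0.5, 2.0, 10.0, 40.0], terr_size_ha=2.0))
```

Very large patches would need a Poisson generator designed for large means, because exp(-mu) underflows in this simple method.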


After populating all patches with a known number 
of birds, we sampled the landscape using 2,500 ran- 
domly located, nonoverlapping 25-hectare sample 
units. For each sample unit, the sampled number of 
birds was calculated by weighting the true number 
of birds in a patch by the proportion of the patch ac- 
tually sampled (Eq. 35.4), 


$$\mathit{Samp\_Birds} = \sum_{i=1}^{n} \mathit{Nbirds}_i \times \mathit{Samp\_Area}_i \qquad (35.4)$$

where Samp_Birds is the number of birds in the sample,
n is the number of patches in the sample grid, and
Samp_Area is the proportion of each conifer patch in
the sample grid. The number of birds sampled,
Samp_Birds, was constrained to be an integer, rounding
fractions of birds to the nearest integer. If the number
of birds sampled was greater than zero, then we
recorded that the species was present in that patch;
otherwise, the species was recorded as absent.
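
The sampling step of Equation 35.4 can be sketched as follows. The input layout (a list of pairs) and the function name are hypothetical, and rounding halves upward is an assumption, since the text states only that fractions were rounded to the nearest integer.

```python
def sample_unit_count(patches):
    """Return (Samp_Birds, present) for one sample unit.

    patches: list of (nbirds, proportion_of_patch_in_unit) pairs,
    one pair per conifer patch intersecting the sample unit.
    """
    total = sum(nbirds * proportion for nbirds, proportion in patches)
    samp_birds = int(total + 0.5)   # nearest integer, halves rounded up
    present = samp_birds > 0
    return samp_birds, present


# Hypothetical unit overlapping half of one patch and a quarter of another.
print(sample_unit_count([(4, 0.5), (3, 0.25)]))
```

Note that a unit containing a small fraction of an occupied patch can round to zero and so be recorded as an absence.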

Both Poisson and logistic regressions were per- 
formed on the resultant data sets from our simula- 
tions. We randomly selected two hundred grids (10 
percent) from each run of the simulation to be in- 
cluded in a training, or test, set. For the presence/ 
absence data, we fit a GLM in which we specified a bi- 
nomial distribution with a logit link (Eq. 35.5). Simi- 
larly, we fit a GLM in which we used a Poisson distri- 
bution with a log link to fit the actual count of 
sampled birds (Eq. 35.6). 


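$$\mathit{PA} = a + b \times \mathit{Samp\_Area}_{tot} \qquad (35.5)$$


$$\mathit{Samp\_Birds} = a + b \times \mathit{Samp\_Area}_{tot} \qquad (35.6)$$


In the logistic regression model, PA indicated
whether the species was present or absent and
Samp_Area_tot was the total conifer area in the sampled
grids. Samp_Birds was the total number of birds sampled
in each grid for the Poisson regression model. In both
equations the right-hand side is the linear predictor on
the scale of the link function.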
Statistical significance for the logistic and Poisson re- 
gression model was assessed using the difference be- 
tween the null deviance and the residual deviance and 
calculating a P-value by assuming a chi-square distri- 
bution (Venables and Ripley 1997). Statistical signifi- 
cance was assessed for all simulations at an alpha level 
of 0.05. All statistical analyses were conducted using
S-Plus (S-Plus 2000 for Windows, MathSoft).
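
For readers without access to S-Plus, the Poisson fit and the deviance-based significance test can be sketched in plain Python. This is a minimal illustration for a single predictor, fit by Newton-Raphson, and is not the routine used in our analyses; the data and function names are hypothetical. The P-value uses the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math


def fit_poisson_glm(x, y, iterations=50):
    """Fit log(E[y]) = a + b*x by Newton-Raphson (maximum likelihood)."""
    a, b = math.log(sum(y) / len(y)), 0.0   # start at the null model
    for _ in range(iterations):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix entries.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


def poisson_deviance(y, mu):
    """Deviance of observed counts y against fitted means mu."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += 2.0 * (log_term - (yi - mi))
    return total


def deviance_test(x, y):
    """Null deviance minus residual deviance, with a chi-square P-value (1 df)."""
    a, b = fit_poisson_glm(x, y)
    fitted = [math.exp(a + b * xi) for xi in x]
    null_mu = [sum(y) / len(y)] * len(y)
    drop = poisson_deviance(y, null_mu) - poisson_deviance(y, fitted)
    p_value = math.erfc(math.sqrt(max(drop, 0.0) / 2.0))
    return a, b, drop, p_value


# Hypothetical sample units: total conifer area (ha) and sampled bird counts.
area = [0, 2, 5, 8, 12, 15, 20, 25]
birds = [0, 0, 1, 2, 2, 4, 5, 8]
print(deviance_test(area, birds))
```

The logistic model can be tested in the same way by replacing the Poisson likelihood and deviance with their binomial counterparts.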

For each run of the simulation, we used the esti- 
mated regression coefficients from both regression 
models to predict the number of individuals or proba- 
bility of occurrence expected in each conifer patch in 
our study area. Goodness of fit for each model to the 
simulated data was assessed by calculating the root 
mean squared error. For the logistic regression model 
(LRM), the root mean squared error is given by Equa- 
tion 35.7, where Pr(P) is the predicted probability of 
occurrence for each patch. This probability is sub- 
tracted from the true probability (0.0 or 1.0) of an oc- 
currence obtained from each simulation. In order to 
compare the root mean squared error of the LRM 
with that of the PRM, we had to convert the predicted
number of individuals into the probability of obtaining
a count of zero (i.e., absence). Thus, for the PRM,
the root mean squared error is given by Equation 35.8,
where Pr(0) = $e^{-m}$ is the predicted probability of an
absence for each patch and m is the predicted count.
This probability is subtracted from the true probabil- 
ity (0.0 or 1.0) of an absence obtained from each 
simulation. 
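
The two error measures can be sketched as below. This follows our reading of the verbal description of Equations 35.7 and 35.8 and should be treated as an assumption about their exact form; the function names and example values are hypothetical.

```python
import math


def rmse(predicted, observed):
    """Root mean squared difference between two equal-length sequences."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def rmse_logistic(pred_prob_present, present):
    """present holds 1.0 where the species occurred and 0.0 elsewhere."""
    return rmse(pred_prob_present, present)


def rmse_poisson(pred_counts, present):
    """Convert each predicted count m to Pr(0) = exp(-m), then compare
    with the true absence indicator (1.0 where the species was absent)."""
    pred_absent = [math.exp(-m) for m in pred_counts]
    absent = [1.0 - p for p in present]
    return rmse(pred_absent, absent)


# Hypothetical predictions for four patches, two of them occupied.
truth = [1.0, 0.0, 1.0, 0.0]
print(rmse_logistic([0.8, 0.3, 0.6, 0.1], truth))
print(rmse_poisson([1.6, 0.4, 0.9, 0.1], truth))
```

Placing both models on the probability scale in this way is what allows their errors to be compared directly.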
Results


Ten simulations were conducted for each of six terri- 
tory sizes, 0.5, 1, 2, 3, 6, and 10 hectares, while the 
occupancy rate, Occ_Rate, was held constant. Sam- 
ples obtained from the simulated data had frequency 
distributions similar to those obtained in actual pop- 
ulation monitoring studies (Fig. 35.1; Fig. 35.4). All 
sixty logistic regressions were statistically significant, 
with P-values less than or equal to 0.05. Similarly, all 
sixty Poisson regressions were statistically signifi- 
cant, with P-values less than or equal to 0.05. 

The two regression models performed similarly 
across the range of simulated densities (Fig. 35.5). 
The difference between the root mean squared error 
ranged from a high of 25 percent at a territory size of 
0.5 hectare to a low of 0 percent at a territory size of 
10 hectares (X̄ = 9.5, s.e. = 3.7, N = 6). There was an
unexpected curvature in the graph of root mean 
squared error, which suggests that both the Poisson 
and logistic regressions perform better when individ- 
uals are either always present or absent (Fig. 35.5). 
Examination of the residuals revealed that Poisson 
regression tended to overpredict abundance when the 
true number of birds was zero and underpredict 
abundance when the true abundance was greater than
zero (Fig. 35.6). 
Figure 35.4. Frequency distribution of the sampled number of individuals from a realization of the simulation at
two density levels. Each histogram is based on two hundred randomly placed 25-hectare sample grids. 


Discussion 


Poisson and logistic regression models were found to 
perform similarly on our simulated data sets. Both types 
of regression models had root mean squared errors that 
averaged less than 0.40 (logistic = 0.363 and Poisson = 
0.397) across all simulated bird densities. However, 
Poisson regression tended to overpredict abundance at 
sites known to have low densities and underpredict 
abundance at sites known to have high densities. This 
error may be acceptable, however, given that logistic re- 
gression is only able to predict the probability of occur- 
rence, which is difficult to relate to abundance. 

There are several other advantages to using PRM. 
First, no transformations are required in order to meet 
normality assumptions. Transformations tend to com- 
plicate the interpretation of the results. Second, PRM 
is a nonlinear regression technique, but its models are
specified as if they were linear models. Thus, errors in
specifying the correct form of the model are avoided. 

As with any simulation study, there are limits to
our study due to the assumptions of our design. Since
we derived our simulated data from an underlying 
random Poisson process, it is not surprising that we 
obtained relatively good fits to the Poisson regression 
model. However, our results are applicable when the 
data being analyzed meet the assumptions of Poisson 
regression, namely that the events are independent 
and the data are generated by a Poisson process. Un- 
fortunately, little is known about the true, underlying 
distribution of bird abundance in nature (see Small- 
wood, Chapter 6). 

So why did our regression models not perform bet- 
ter than they did? We suspect that this behavior could 
be due to one or a combination of the following fac- 
tors. First, we ignored the effects of autocorrelation 
that may occur due to the spatial configuration of the 
conifer patches in our landscape. Although we did not 
quantify the degree of spatial autocorrelation in the 
size of conifer patches, there is evidence that such nonrandomness
exists in this landscape. This would be a
violation of the assumption of Poisson regression that
all events are independent. Second, we chose to use a
completely random sampling scheme that might not
have produced a truly representative training set given
the nonrandom distribution of patch sizes in the landscape.
If a training set does not include the full range
of possible values, it is unlikely that any statistical
technique will be able to make accurate predictions
outside the range of values encountered during model
building. Third, our sampling scheme may have adequately
sampled the full range of possible values but
differentially sampled the response variable so that the
resultant frequency distribution did not match that of
the process that generated the data (i.e., the sampled
data might be overdispersed).

Figure 35.5. Comparison of the performance of Poisson and logistic regression models using the root mean squared error. For the Poisson regression, the error is the difference between the predicted probability of a zero count and the true probability of a zero count for each sample. Likewise, the error for the logistic regression is the difference between the predicted probability of occurrence and true probability of occurrence. The root mean squared error for each realization of the simulation is calculated using 1,709 sample grids not used during model building.

How might one deal with these possibilities? One
can explicitly model the spatial autocorrelation as suggested
by Klute et al. (Chapter 27) or Smith (1994).
Overdispersion of these data can be handled in one of
three ways. First, one can fit a Poisson regression model
and account for the overdispersion with an
additional parameter that must be specified a priori
(Long 1997; Venables and Ripley 1997). A second option
is to fit a regression model based on a negative binomial
distribution (Long 1997), which generalizes
Poisson regression by adding a dispersion parameter.
A third option, and perhaps the most intriguing but
technically challenging, is to fit a zero-inflated Poisson
(ZIP) regression (Lambert 1992; Welsh et al. 1996).
ZIP regression is a mixture of logistic regression for
the zero data and a Poisson regression for the positive
integer data. The technical details of ZIP regression
are beyond the scope of this chapter, but the reader is
directed toward Lambert's (1992) original paper or
Long's (1997) discussion of this technique.

Figure 35.6. Studentized residuals for a Poisson regression model. The data used were generated for one realization of the simulation with a territory size equal to 3 hectares. Note that there is a tendency for Poisson regression to overpredict the number of individuals when the true number of birds is equal to zero and to underpredict when the true abundance is greater than zero.

In summary, we have shown that Poisson regres- 
sion performs similarly to logistic regression when 
using data that has a frequency distribution similar to 
real abundance data. Based on the results of our simu- 
lation, the mean difference in assessing the probability 
of occurrence was only 10 percent. When compared to 
the loss of information that occurs when converting 
abundance data to presence/absence categories, we be- 
lieve this difference is not biologically significant. 
However, we should caution managers that predictive 
models developed using Poisson regression may be 
overly optimistic when used to model data from rare
species and somewhat conservative when used for
common species. 
36


Predicting Vertebrate Occurrences from
Species Habitat Associations: Improving the
Interpretation of Commission Error Rates


Sandra M. Schaefer and William B. Krohn


ji e step in conservation is determining where 
animal and plant species occur. However, con- 
ducting complete field inventories of vertebrate occur- 
rences is generally infeasible. So, wildlife-habitat rela- 
tionship models are often used to predict species 
presence, absence, or relative abundance. Since our 
knowledge of species habitat use is limited, validation 
of these models is essential (Morrison et al. 1992; 
Csuti 1996; Krohn 1996). One common testing 
method is to compare the predicted occurrences to 
species lists obtained from test sites having long-term 
field inventories. Omission error (percentage of species 
present but not predicted), commission error (percent- 
age of species predicted but not present), and the per- 
centage of species matched (percentage of species pres- 
ent that are predicted) can then be used to evaluate 
model reliability (e.g., Scott et al. 1993; Edwards et al. 
1996; Fielding and Bell 1997). 
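
These three metrics can be sketched from a pair of species lists. The denominators are an assumption on our part: omission and match are expressed relative to the species observed on the site, and commission relative to the species predicted. The species lists and the function name are hypothetical.

```python
def validation_metrics(predicted, observed):
    """Percent omission, commission, and match for one test site."""
    predicted, observed = set(predicted), set(observed)
    matched = predicted & observed      # present and predicted
    omitted = observed - predicted      # present but not predicted
    committed = predicted - observed    # predicted but not present
    return {
        "omission_pct": 100.0 * len(omitted) / len(observed),
        "matched_pct": 100.0 * len(matched) / len(observed),
        "commission_pct": 100.0 * len(committed) / len(predicted),
    }


# Hypothetical lists for a single test site.
predicted_species = ["veery", "ovenbird", "wood thrush", "hooded warbler"]
observed_species = ["ovenbird", "wood thrush", "mourning warbler"]
print(validation_metrics(predicted_species, observed_species))
```

Under these definitions the omission and match percentages sum to 100, whereas commission depends on the length of the predicted list.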

Problems with the above validation metrics are 
often encountered in the interpretation stage (Krohn 
1996). There are many factors associated with species 
biology and the methods used that can influence errors 
and complicate their interpretation, including the 
presence of species on test sites that go unsurveyed
(Nichols et al. 1998b; Boone and Krohn 1999; Karl et 
al., Chapter 51). Further, size of test sites and defini-
tions of species presence on a site can influence com-
mission and omission errors (S. Schaefer in prepara-
tion). Close examination of the species
being omitted and the data layers used in the predic- 
tion process will often identify the cause of the omis- 
sion error. In contrast, commission error is more trou- 
blesome, with a key issue being the need to assess if 
the error reported is an actual error (the species is not 
present on the site) or if it is an apparent error (the 
species is present on the site but has not been recorded 
as a result of incomplete field inventories). For exam- 
ple, since the publication of the predicted distributions 
from the Idaho Gap Analysis Project, the sharp-tailed 
grouse (Tympanuchus phasianellus) has been con- 
firmed to be present in areas where it had never before 
been recorded (Scott et al. 1993). In this case, com- 
mission error could be viewed as an apparent error of 
the prediction and not an actual error. (The species 
was present at the time of the GAP prediction.) 

Rare and reclusive species can be difficult to detect 
during standard field surveys designed to inventory a 
wide variety of species. Thus, these species are likely 
to have higher estimates of commission error when 
predicted occurrences are compared to known field 
observations. Boone and Krohn (1999) recognized 
that biological characteristics of species can influence 
detectability and proposed that an a priori ranking 
system based upon the likelihood of detection could 
be related to commission error. Using avian occur- 
rences from the Maine Breeding Bird Atlas (MBBA) 
(Adamus 1987), they established a ranking system 
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called Likelihood of Occurrence Ranks (LOORs), 
which ranked all of the birds known to breed in 
Maine based upon how frequently they occurred in 
towns within their range limit (see below). In a gap- 
like analysis, they observed a strong correlation be- 
tween LOORs and commission error on five of six test 
sites (ρ = 0.86-0.93, P = 0.002; Boone and Krohn
1999).

The purpose of the analysis discussed in this chap- 
ter was to determine if the a priori ranking system of 
Boone and Krohn (1999) improves the interpretation 
of the commission errors resulting from species- 
habitat models designed to predict presence or ab- 
sence for gap analysis. The Gap Analysis Program 
(GAP) is a nationwide effort of the U.S. Geological 
Survey (USGS), Biological Resources Division (BRD), 
designed to assess some elements of biodiversity (Scott 
et al. 1993; Scott et al. 1996). GAP uses models pri- 
marily based on species-habitat associations, along 
with other data, such as range limits and vegetation, 
in a geographic information system (GIS), to predict 
the presence of terrestrial vertebrates that breed in a 
state (Scott et al. 1993). Data for this analysis came 
from the Maine Gap Analysis Project (Maine GAP) 
(Krohn et al. 1998), and our objective was to deter- 
mine if LOORs and commission error were corre- 
lated. If rates of commission are constant across 
LOORs, then overprediction by the habitat models 
would be suggested (i.e., actual errors). 


Methods 


For this study, avian LOORs were calculated by 
Boone and Krohn (1999) and herptile LOORs were 
tabulated by Krohn et al. (1998). In both studies, atlas 
occurrence information was used to generate a spatial 
incidence for all species. Mammals were not included 
in this analysis because no unbiased surveys of inci- 
dence existed for Maine. To calculate the avian 
LOORs, Boone and Krohn (1999) used occurrence 
data from the MBBA (Adamus 1987). The spatial inci- 
dence was calculated by dividing the number of 
MBBA blocks having confirmed or potential breeding 
occurrences by the number of MBBA blocks within 
the species range. Because the MBBA was more than 
fifteen years old, they updated the spatial incidences
using logistic regression to model a suite of avian
species-specific variables, including data taken from
the USGS, BRD Breeding Bird Survey (BBS) during the
period of the MBBA. The outdated MBBA data were
then replaced with the new information, giving up- 
dated incidences. These incidences were sorted and as- 
signed a rank, which became the species LOORs. Low 
ranks indicate the species has low detectability, and, 
conversely, high-ranking species are those that are eas- 
ier to detect in the field. Species with inadequate data 
available to assign spatial incidences were given a rank 
of zero and excluded from the correlation analysis 
(Boone and Krohn 1999). 
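The incidence-and-rank step described above can be sketched as follows. This is a hedged illustration rather than the procedure of Boone and Krohn (1999): the function names and block counts are hypothetical, the logistic-regression update is omitted, and breaking ties alphabetically is an assumption, since the text does not say how ties were handled.

```python
def spatial_incidence(occupied_blocks, blocks_in_range):
    """Fraction of atlas blocks within a species' range that have breeding records."""
    if blocks_in_range <= 0:
        raise ValueError("a species needs at least one atlas block within its range")
    return occupied_blocks / blocks_in_range


def assign_loors(incidences):
    """Rank species from lowest to highest incidence (1 = least likely to be recorded).

    Species whose incidence is None (inadequate data) receive rank 0.
    Ties are broken alphabetically, which is an assumption of this sketch.
    """
    with_data = sorted(
        (sp for sp, value in incidences.items() if value is not None),
        key=lambda sp: (incidences[sp], sp),
    )
    loors = {sp: rank for rank, sp in enumerate(with_data, start=1)}
    loors.update({sp: 0 for sp, value in incidences.items() if value is None})
    return loors


# Hypothetical atlas counts: (blocks with records, blocks within range).
atlas_counts = {
    "species A": (12, 400),
    "species B": (310, 420),
    "species C": (95, 380),
    "species D": None,  # inadequate data
}
incidences = {
    sp: (spatial_incidence(*counts) if counts is not None else None)
    for sp, counts in atlas_counts.items()
}
loors = assign_loors(incidences)
```

Species with rank 0 would then be dropped before any correlation analysis, as the text describes.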

Krohn et al. (1998) used occurrence information 
from Maine Amphibians and Reptiles (Hunter et al.
1999) to calculate herptile LOORs. Since the informa- 
tion in the amphibian and reptile atlas was recent, 
there was no need to conduct additional modeling to 
update the data, as was done for avian species. Inci- 
dences for amphibians and reptiles were combined 
into one list of herptiles, sorted, and then ranked, giv- 
ing the LOORs for each species. Combining the two 
taxonomic classes was done to increase the sample 
size used in the correlation analysis. 

Predicted occurrences from the Maine Gap Analysis 
Project (Boone and Krohn 1998a,b) were compared to 
records from nine sites in Maine having field surveys. 
Amphibian, reptile, and bird occurrences came from
checklists compiled by the National Park Service (NPS
1990, 1996) and the U.S. Fish and Wildlife Service
(USFWS) (USFWS 1989, 1994a,b, 1995, 1996; Table 
36.1, Fig. 36.1). Additional avian occurrences were 
also obtained from field inventory and research 
records from the White Mountains National Forest 
(D. Capen, Univ. of Vermont personal communica- 
tion), Baxter State Park (Oliveri 1993), and two pri- 
vately owned areas (Hagan et al. 1997; J. Witham, 
Univ. of Maine personal communication; Table 36.1, 
Fig. 36.1). 

For each site, the number of species correctly pre- 
dicted and the number in commission were tabulated 
and compared to five groups of species for birds and 
three groups of species for herptiles. Species were as- 
signed to groups based upon LOORs (ranging from 
low to high) with an equal number of species per group
(as much as possible). This was done to remove any 
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TABLE 36.1. 


Test-site names, data type, and available information used in testing the accuracy of the vertebrate predictions from Maine Gap Analysis.

Site                                                                    Survey   Amphibians
no.   Name of test site                                    Size (ha)    length   and reptiles   Birds
1     North Maine Forestlands Study, Moosehead Lake Area        293      1^a                    X
2     Nesowadnehunk Field, Baxter State Park                    177      3^a                    X
3     White Mountains National Forest                           181      5^a                    X
4     Sunkhaze Meadows National Wildlife Refuge               3,833     10^b                    X
5     Holt Research Forest                                      172     15^a     X              X
6     Petit Manan National Wildlife Refuge                      993     22^b                    X
7     Rachel Carson National Wildlife Refuge                  1,768     32^b     X              X
8     Moosehorn National Wildlife Refuge                      9,297     61^b                    X
9     Mount Desert Island/Acadia National Park               28,033     79^b     X              X

^a Number of years the area has been surveyed.
^b Actual number of survey years unknown, so the number of years in existence is instead reported.


possibility of bias that might have occurred by includ- 
ing species that do not occur on a particular site due 
to range limits in the state. Spearman's rho (α =
0.05) was used to quantify the relationship between
species counts and the LOORs group for each taxo- 
nomic class on each site. 
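One way to read the grouping step is sketched below. It is an interpretation, not the authors' code: the function names and species labels are hypothetical, each site's predicted species are split into groups of near-equal size by LOOR, and placing any leftover species in the lowest groups is an assumption.

```python
def assign_groups(loors, n_groups):
    """Split ranked species into n_groups of near-equal size, lowest LOORs first.

    Species with rank 0 (inadequate data) are left out. When the species do
    not divide evenly, the extra ones go to the lowest groups (an assumption).
    """
    ranked = sorted((sp for sp, rank in loors.items() if rank > 0),
                    key=lambda sp: loors[sp])
    base, extra = divmod(len(ranked), n_groups)
    groups, start = {}, 0
    for g in range(1, n_groups + 1):
        size = base + (1 if g <= extra else 0)
        for sp in ranked[start:start + size]:
            groups[sp] = g
        start += size
    return groups


def tabulate(groups, predicted, observed, n_groups):
    """Count predicted, correctly predicted, and commission species per group."""
    counts = {g: {"predicted": 0, "present": 0, "commission": 0}
              for g in range(1, n_groups + 1)}
    for sp in predicted:
        g = groups.get(sp)
        if g is None:          # rank-0 species are not tabulated
            continue
        counts[g]["predicted"] += 1
        counts[g]["present" if sp in observed else "commission"] += 1
    return counts


# Hypothetical site: eight predicted species, one of them without a usable rank.
site_loors = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "f": 6, "g": 7, "h": 0}
predicted = set(site_loors)
observed = {"c", "e", "f", "g"}
groups = assign_groups(site_loors, 3)
counts = tabulate(groups, predicted, observed, 3)
```

The per-group counts produced this way are the inputs to the rank correlation against the group number.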


Results 


Overall, the mean commission error for amphibians 
and reptiles was low (x̄ = 12.3 percent, range 0 to
36.8 percent) and the mean percentage of species
matched was high (x̄ = 97.3 percent, range 92 to 100
percent; Table 36.2). No trend was apparent when 
combined amphibian and reptile errors were plotted 
for each LOOR group (Fig. 36.2). A valid Spearman's 
Rho analysis on commission error could only be con- 
ducted for the Holt Research Forest, because on 
Rachel Carson National Wildlife Refuge there was no 
commission error, and Mount Desert Island had too 
many ties in the number of species matched to con- 
duct a rank correlation test (Table 36.3). The Spear- 
man's rho for the Holt Research Forest indicated that 
there was not a significant relationship between 
commission error and the LOOR groups (ρ = -0.5,
P < 0.704; Table 36.3). The correlation between the
number of species matched and LOORs was also not
significant (ρ = 0.5, P < 0.704; Table 36.3).

Commission error was much higher for birds than it was for
herptiles (x̄ = 76.8 percent ± 45.8, compared to 12.3
percent ± 21.2; Table 36.2). The number of bird
species matched and the LOOR groups were posi-
tively correlated (ρ = 0.6 to 1.0) on all sites (n = 9; Fig.
36.3). Relationships were significant (P ≤ 0.05) for all
sites except for the White Mountains National Forest
(P < 0.291) and Acadia National Park (P < 0.059)
(Table 36.4). An inverse relationship was observed be-
tween commission error and the LOORs (ρ = -0.87 to
-1.0) (Fig. 36.3). The Spearman's rho tests confirmed
the significance of this relationship on all sites except
Acadia National Park (P < 0.059; Table 36.4).
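As a check on how such coefficients arise, the sketch below computes Spearman's rho between LOOR group number and the number of species in commission, using the group 1 to 5 commission counts listed in Table 36.4 for Moosehorn and Acadia. It is a plain-Python illustration with hypothetical function names, not the software used in the study; it handles ties by average ranks, which is an assumption about the original analysis, and it does not compute P values.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from smallest (1) to largest, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(vx * vy)


loor_groups = [1, 2, 3, 4, 5]
moosehorn_commission = [6, 6, 6, 3, 1]   # Table 36.4, site 8, groups 1-5
acadia_commission = [7, 8, 5, 0, 0]      # Table 36.4, site 9, groups 1-5

rho_moosehorn = spearman_rho(loor_groups, moosehorn_commission)
rho_acadia = spearman_rho(loor_groups, acadia_commission)
```

Rounded to two decimals these give -0.89 and -0.87, the values reported for those two sites. A site with no commission error in any group has a constant series, for which the coefficient is undefined; the function raises an error in that case.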


Discussion 


Field surveys are often incomplete inventories of the 
species present in a given area (e.g., Nichols et al. 
1998a). Factors such as species detectability, the num-
ber of years a survey has been conducted, and the
amount of effort placed in searching for species all in- 
fluence which species will be recorded and which will 
be missed during a survey (Boone and Krohn 1999; 
Karl et al., Chapter 51; Fielding, Chapter 21). We 
found that ranking species a priori by how
likely each is to be observed during a field inven-
tory helps to detect the effects of incomplete field sur-
veys on model validation.

An initial interpretation of the commission errors 
reported for Maine GAP, without correcting for 
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1 - North Maine Forestlands, Moosehead Lake
2 - Nesowadnehunk Field, Baxter State Park
3 - White Mountains National Forest
4 - Sunkhaze Meadows National Wildlife Refuge
5 - Holt Research Forest
6 - Petit Manan National Wildlife Refuge
7 - Rachel Carson National Wildlife Refuge
8 - Moosehorn National Wildlife Refuge
9 - Mount Desert Island / Acadia National Park


Figure 36.1. Locations of test sites used in the accuracy assessment of predicted distributions of terrestrial vertebrates from Maine Gap Analysis.


incompleteness of the inventories, would indicate that 
the models are overpredicting about 76 percent of the 
bird species in the state (Table 36.2). Block et al. 
(1994) faced a similar problem with their predictive 
models. They reported commission errors ranging 
from 29 to 44 percent and felt that this level of error 
was unacceptable. In both studies, these findings could
lead researchers to believe that the species-habitat as-
sociation models have not been correctly constructed.
However, the inverse correlation we observed between 
commission error and LOORs indicates that many of 
the errors reported in the Maine GAP predictions are 
related to species detectability. Thus, by examining
the models within an a priori ecological context of 
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Figure 14.5. (a-f) Variation in the quality and spatial distribution of yellow-billed cuckoo (Coccyzus americanus) 
habitat from 1938 to 1997 on the Sacramento River, river-miles 196-219. 
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Figure 15.2. Distribution of land-cover types in the two study areas. 
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Figure 17.1. Summer distribution map of brown-headed cowbird (Molothrus ater). 
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Figure 18.2. Three-level ecoregional importance valuation for Santa Cruz County, California. Central areas of high im- 
portance are influenced by the distribution of redwood forest across the ecoregion. Urban and agriculture areas are 
included for reference. Locator map shows Santa Cruz County within the Jepson Central West ecoregion. 


[Map legend: Current Urban Land Use; Forecast Urban Land Use. Scale bars in kilometers and miles.]


Figure 18.3. Forecast of Urban Growth. Fifty-year forecast for Santa Cruz County, California. Only forecast areas larger 
than 100 hectares are shown. 
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Figure 19.1. The Klamath Province in northwestern California with boundaries of national forests, 
ecological zones, and Late Successional Reserves. 
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Figure 22.1. Current (A), burned (B), and restored (C) landscapes used to map predicted 
use areas for sage sparrows (Amphispiza belli) by D² and Pearson’s planes of least fit.
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Figure 22.2. Predicted use areas for sage sparrows (Amphispiza belli) in current (A), 
burned (B), and restored (C) landscapes measured by the Mahalanobis distance (D²). We
converted D² into a χ² probability of similarity to habitat used by sage sparrows at 36
sample sites. Maps correspond to landscape in Figure 22.1. 
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Figure 22.4. Predicted use for sage sparrows (Amphispiza belli) in current (A), burned (B), and restored 
(C) landscapes measured by the values of d_k²/λ_k of the first plane of closest fit (k = 1). Values closest to
zero represent highest similarity to the plane. Maps correspond to landscapes in Figure 22.1. 
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Figure 24.1. Map of the Central Highlands RFA (Regional Forest Agreement) region, with modeling survey sites. 
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Figure 25.3. Potential maps for Carex curvula, obtained with (a) an ordinal GLM (generalized linear model), (b) 
a Poisson GLM, and (c) a Gaussian GLM. 
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Figure 25.4. Potential maps for Trifolium alpinum obtained with (a) an ordinal GLM (generalized linear model), 
(b) a Poisson GLM, and (c) a Gaussian GLM. 
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Figure 31.1. Probability of occurrence values for the magnolia warbler (Dendroica magnolia) in 
the Great Lakes region. 
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Figure 33.1. Atlas of breeding western meadowlarks (Sturnella neglecta) in North Dakota: (a) Original data; filled
squares denote confirmed or suspected evidence of breeding during 1950-1972 (from Stewart 1975). (b) Results of
simple spatial smoothing. (c) Results of smoothing based on search effort.


Figure 33.2. Atlas of breeding Baird's sparrows (Ammodramus bairdii) in North Dakota: (a) Original data; filled
squares denote confirmed or suspected evidence of breeding during 1950-1972 (from Stewart 1975). (b) Results of
simple spatial smoothing. (c) Results of smoothing based on search effort.


Figure 33.3. Atlas of breeding blue jays (Cyanocitta cristata) in North Dakota: (a) Original data; filled 
squares denote confirmed or suspected evidence of breeding during 1950-1972 (from Stewart 
1975). (b) Results of simple spatial smoothing. (c) Results of smoothing based on habitat informa- 
tion. (d) Results of smoothing based on habitat information and search effort. 


Figure 33.4. Simulated (known) distribution of hypothetical highly detectable species used to evalu- 
ate methods: (a) Original data; filled squares denote presence of species. (b) Results of simple spa- 
tial smoothing. (c) Results of smoothing based on habitat information. (d) Results of smoothing 
based on habitat information and search effort. 


Figure 33.5. Simulated (known) distribution of hypothetical species with low detectability used to 
evaluate methods: (a) Original data; filled squares denote presence of species. (b) Results of simple 
spatial smoothing. (c) Results of smoothing based on habitat information. (d) Results of smoothing 
based on habitat information and search effort. 
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Figure 33.6. Weights (w*) used in simple spatial smoothing method, and Steps taken to determine 
weights. Target cell (shaded) and immediate neighbors contribute data for determining the probability 
of a species’ occurrence in the target cell. I(p) = 1 if the species is known to occur in a cell; other-
wise, I(p) = 0. The variable c denotes the distance between the center of a data cell and the center
of the target cell and is used to estimate d, the average distance between points in the two cells. 
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Figure 35.3. Location of landscape used in simula- 
tion, and detail of land cover classification. 


Figure 37.1. (A) Map of Oregon with sixty-six land-cover classes. (B) Map of Oregon
with thirteen land-cover classes. Modified from Oregon Gap Project.
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Figure 38.1. Deer mouse (Peromyscus maniculatus) presence/absence model (left) and ranked model (right). 
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Figure 38.2. Meadow vole (Microtus pennsylvanicus) presence/absence model (left) and ranked model (right). 
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Figure 54.2. Model output and observed bird survey data for yellow-billed cuckoo (Coccyzus amer- 
icanus) on the Deerfield Ranger District of the George Washington and Jefferson National Forest 
in Virginia. Portions of eight bird survey routes are pictured, with black dots indicating locations 
where yellow-billed cuckoos were present and white dots where they were absent. (continues) 
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Figure 54.2. (continued) Model outputs from the four different methods (logistic re- 
gression [A], Mahalanobis distance [B], CART [C], and discriminant analysis [D]) are in- 
dicated in the legends. 


Figure 55.1. Results of GAP (medium gray) and GARP (dark gray) modeling approaches for wood thrush (Hyloci- 
chla mustelina), with test presence points (light dots) and all BBS sample points (black dots) overlaid. 


Spatial random effect predictions 


Figure 56.4. Top: Map of predicted random spatial effects at the grid locations. Bottom: Prediction standard er- 
rors (marginal posterior standard deviations). Route locations are indicated by an X. Contours in the top panel 
are from —0.75 to 1, incremented by 0.25. 
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Figure 56.5. Top: Map of estimated expected route count at the grid locations. Bottom: Prediction standard er- 
rors. Route locations are indicated by an X. Contours in the top panel are from 10 to 60, incremented by 10. 
Contours in the bottom panel are from 5 to 30, incremented by 5. 
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Figure 56.6. Top: Percent forest cover. Bottom: Map of predicted route effects (shown again to facilitate 
comparison). 
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Figure 58.2. Potential distribution of the lynx in the Swiss Jura Mountains according to the model 
derived from the three sets of response variables (a = both sexes combined, b = females alone, and 
c = males alone). Lines represent the dispersal route of subadult lynx (F = females, M = males); 
polygons represent transient or definitive home ranges. Subadult lynx that survived the first year of 
independence are shown in black, and those that died are shown in red. 


Figure 15.1. Red-winged Blackbird (Agelaius phoeniceus) hierarchy of spatial population units from deme to species at
Columbia National Wildlife Refuge, WA. 
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Figure 59.3. Habitat patches for both individual and population level are derived from the 
habitat-value map, using a neighborhood averaging function. 
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Figure 59.4. Meta-population patches are created by first modifying the resources map to reflect the ma- 
trix quality as a surrogate to resistance to movement between population patches. 
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Figure 60.3. The Idaho Southern Batholith and location of 890-hectare area for silvicultural 
treatment, pre-treatment white-headed woodpecker (Picoides albolarvatus) habitat assess- 
ment (a), 52 hectares of forests subjected to treatment (b), and posttreatment white- 
headed woodpecker habitat assessment (c). 
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Figure 62.1. Map of the Olympic Peninsula, Washington, showing ownership patterns. Federal lands include 
National Forest system (NFS) lands and National Park Service (NPS) lands. Lands designated as “Other” in- 
clude industrial and other privately owned areas. 
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Figure 62.3. Distribution of northern spotted owl (Strix occi- 
dentalis caurina) nesting and roosting habitat on the Olympic 
Peninsula, Washington, based on Scenario 2, retention of all 
nonfederal habitat. (A) depicts percentages of habitat within 
1,500-hectare hexagonal cells as used in simulation models. 
(B) shows projected rates of occupancy by simulated pairs of 
northern spotted owls, calculated as percentage of years occu- 
pied by a breeding pair over all replicated 100-year simulation 
runs under rule set B; (C) depicts results under rule set D. 
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Figure 63.1. Changes in species densities across a riparian habitat gradient in southeastern Ari- 
zona. Cottonwood/willow gallery forest predominates within 200 meters of the river channel, giving 
way to mesquite woodland in the floodplain and desert scrub outside the riparian zone. The curves 
illustrate variation in species density for three bird species (a) yellow warbler (Dendroica petechia), 
a riparian specialist that favors cottonwood/willow habitat and exploits the edge with mesquite 
woodland; (b) Abert's towhee (Pipilo aberti), an edge exploiter that favors mesquite-dominated habi-
tats; and (c) black-throated sparrow (Amphispiza bilineata), which prefers desert-scrub habitats and 
exploits the edge with mesquite woodland. 


Figure 63.4. Thematic Mapper Simu- 
lator (TMS) image and simplified 
habitat grid for the modeled land- 
scape along the upper San Pedro 
River corridor in southeastern Ari- 
zona. The landscape has been clas- 
sified using three habitat classes 
that correspond to the most abun- 
dant habitat types: cottonwood/wil- 
low gallery forest, mesquite wood- 
land, and desert scrub. 
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Figure 63.5. Edge habitat grid. The Effective Area Model creates a grid discriminating areas within each habitat class,
based upon the habitat type that forms the nearest edge. This information is used to select the appropriate edge response
curve (Figure 63.2) to apply to each pixel when estimating animal density across the landscape.


Figure 63.6. Edge proximity grid. The Effective Area Model determines the distance to the nearest edge for each pixel in the
habitat grid. This grid serves as the template for projecting the appropriate edge response curve onto the habitat grid.
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Figure 63.7. Animal density grids. Panel (A) shows the results of the null 
model. Panel (B) depicts the results of the Effective Area Model's projec- 
tion of edge response functions onto the combined edge habitat grid, gen- 
erating a map of expected animal density across the modeled landscape. 
Note that, for our example, yellow warbler (Dendroica petechia) densities 
are highest along the cottonwood/willow-mesquite edge. 


[Legend: Cottonwood/willow; Mesquite; Desert scrub]


Figure 63.8. Habitat type conversion 
following an aquifer draw-down sce- 
nario, resulting in a reduction in river 
flow through a desert riparian land- 
scape. In the case illustrated, 10 
hectares of cottonwood/willow habi- 
tat have shifted to mesquite wood- 
land, and 40 hectares of mesquite 
woodland have shifted to desert 
scrub. 


A. Initial Habitat B. After Habitat Conversion 
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Figure 63.9. The Effective Area Model applied to the riparian habitat de- 
picted in Figures 63.4 and 63.8 to estimate predicted animal density be- 
fore and after habitat-type conversion following a hypothetical reduction 
in river-flow volume. Panel (A) illustrates predicted density for the yellow 
warbler (Dendroica petechia) in the existing habitat, while panel (B) 
shows predicted bird densities for the same reach following conversion 
of riparian habitats to more xeric types. 


TABLE 36.2. 


Percentage and number of species matched and in commission for test sites used in the predicted vertebrate accuracy
assessment; sites are ordered by length of field inventory.

                                                            Matches^a           Commission error^b
Test site                                    No. present    Count   Percent     Count   Percent
Amphibians and reptiles
  Holt Research Forest                            19          19     100           7     36.8
  Rachel Carson National Wildlife Refuge          32          32     100           0      0
  Mount Desert Island/Acadia National Park        25          23      92           1      0.04
  Mean (± st. dev.)                                                   97.3 (± 4.6)        12.3 (± 21.2)
Birds
  North Maine Forestlands                         72          72     100          67     93.1
  Nesowadnehunk Field, Baxter State Park          55          55     100          76    138.2
  White Mountains National Forest                 74          74     100          68     91.9
  Sunkhaze Meadows National Wildlife Refuge      114         111      97.4        39     34.2
  Holt Research Forest                            60          57      95.0        81    135.0
  Petit Manan National Wildlife Refuge            92          92     100          64     69.6
  Rachel Carson National Wildlife Refuge          79          79     100          74     93.6
  Moosehorn National Wildlife Refuge             137         133      97.1        25     18.2
  Mount Desert Island/Acadia National Park       135         134      99.3        23     17.4
  Mean (± st. dev.)                                                   98.7 (± 1.83)       76.8 (± 45.8)

^a Percent matched = [number of species predicted present that were present / number of species present on the site] * 100.
^b Commission error = [number of species predicted but not present on the site / number of species present] * 100.


TABLE 36.3. 


Results of tests of Likelihood of Occurrence Ranks^a for each test site having amphibian and reptile surveys in
Maine; number of years the site has been potentially surveyed shown in parentheses.

                                          LOORs
                                          Low          High
                                           1     2     3       ρ       P
Site 5: Holt Research Forest (15 years)
  Number of species predicted              8     9     9
  Number in commission                     4     1     2     -0.5    0.704
  Number of predicted species present      4     8     7      0.5    0.704
Site 7: Rachel Carson National Wildlife Refuge (32 years)
  Number of species predicted             10    11    11
  Number in commission                     0     0     0       -       -
  Number of predicted species present     10    11    11       -       -
Site 9: Mount Desert Island/Acadia National Park (79 years)
  Number of species predicted              8     8     8
  Number in commission                     0     1     0      0.0     1.0
  Number of predicted species present      8     7     8      0.0     1.0

^a LOORs, defined by Boone and Krohn (1999).


TABLE 36.4. 


Results of tests of Likelihood of Occurrence Ranks^a for each test site having bird surveys in Maine; number of years the site has
been potentially surveyed shown in parentheses.

                                          LOORs
                                                Low                     High
                                           0     1     2     3     4     5       ρ       P
Site 1: North Maine Forestlands, Moosehead Lake area (1 year)
  Number of species predicted              3    27    27    28    27    27
  Number in commission                     3    20    16    16     7     5    -0.97   0.006
  Number of predicted species present      0     7    11    12    20    22     1.0    0.01
Site 2: Nesowadnehunk Field, Baxter State Park (3 years)
  Number of species predicted              5    25    25    26    25    25
  Number in commission                     5    22    16    15    10     8    -1.0    0.01
  Number of predicted species present      0     3     9    11    15    17     1.0    0.001
Site 3: White Mountains National Forest (5 years)
  Number of species predicted              2    28    28    28    28    28
  Number in commission                     2    22    24    10     8     2    -0.9    0.042
  Number of predicted species present      0     6     4    18    20    16     0.60   0.291
Site 4: Sunkhaze Meadows National Wildlife Refuge (10 years)
  Number of species predicted              3    29    29    31    29    29
  Number in commission                     3    14    11     5     5     1    -0.97   0.006
  Number of predicted species present      0    15    18    26    24    28     0.90   0.042
Site 5: Holt Research Forest (15 years)
  Number of species predicted              2    27    27    28    27    27
  Number in commission                     2    25    24    17     9     4    -1.0    0.01
  Number of predicted species present      0     2     3    11    18    23     1.0    0.001
Site 6: Petit Manan National Wildlife Refuge (22 years)
  Number of species predicted              4    30    30    32    30    30
  Number in commission                     4    17    20    13     9     1    -1.0    0.01
  Number of predicted species present      0    13    10    19    21    29     0.90   0.042
Site 7: Rachel Carson National Wildlife Refuge (32 years)
  Number of species predicted              8    29    29    29    29    29
  Number in commission                     8    19    18    16     9     4    -1.0    0.001
  Number of predicted species present      0    10    11    13    20    25     1.0    0.001
Site 8: Moosehorn National Wildlife Refuge (61 years)
  Number of species predicted              4    30    31    32    31    30
  Number in commission                     3     6     6     6     3     1    -0.89   0.045
  Number of predicted species present      1    24    25    26    28    29     1.0    0.001
Site 9: Mount Desert Island/Acadia National Park (79 years)
  Number of species predicted              5    30    30    32    30    30
  Number in commission                     3     7     8     5     0     0    -0.87   0.059
  Number of predicted species present      2    23    22    27    30    30     0.87   0.059

^a LOORs, defined by Boone and Krohn (1999).
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Figure 36.2. For each test site with amphibian and reptile data, the number of species correctly modeled (denoted by a circle and 
solid line) and the number of species in commission (denoted by a dashed line and a square). Sites are ordered from smallest to 
largest: (a) Holt Research Forest, (b) Rachel Carson National Wildlife Refuge, and (c) Mount Desert Island/Acadia National Park. 


species detectability, we were able to determine that
much of our commission error was due to apparent rather
than actual errors in the models. We therefore suspect
that the models are adequately predicting the presence
of species in Maine, although additional effort needs
to be put into surveying for those species with low
LOORs before we can be fully confident.

The highest correlations between LOORs and the 
number of species in commission occurred on smaller 
sites with shorter surveys, which also indicates that 
the errors are apparent rather than actual (Fig. 36.3).
Surveys such as those for the North Maine Forestlands
and White Mountains National Forest (conducted for one
year and five years, respectively) have not been estab-
lished for a period of time long enough to capture the
presence of the more uncommon and reclusive species 
(Schaefer in prep.). In addition, these surveys came 
from research projects having a specific objective of 
surveying forest songbirds (Hagan et al. 1997; D. 
Capen, Univ. of Vermont personal communication). 
These factors in an a posteriori evaluation of error may
lead researchers to incorrectly conclude that actual er- 
rors are present in the models (Edwards et al. 1996) 
when it is more likely that many of the errors are re- 
lated to incomplete field surveys. 

A more or less constant rate of commission error 
across LOORs on test sites with long histories of field 
inventories would indicate real overpredictions are 
being reported on the site (Boone and Krohn 1999). 


This can be seen in the number of species in commis- 
sion on Moosehorn National Wildlife Refuge and 
Acadia National Park (Fig. 36.3). The moderate corre-
lation for these sites (ρ = -0.89 and -0.87) suggests
that the species lists for these areas are relatively com- 
plete. Given that these sites have been surveyed for ex- 
tended periods of time (sixty-one and seventy-nine 
years respectively; Table 36.1) it is reasonable to con- 
clude that the species occurring on the sites have been 
well documented. However, even with a constant rate 
of error, care in the interpretation process must still be 
taken. The lack of correlation might be due to too
small a sample size or to too many LOORs groups
(data spread too thinly). For example, too small a
sample size was problematic in this analysis with the
herptile data. On all sites, predictions were relatively
accurate, having a high percentage of species matched
(x̄ = 97.3 percent) and relatively little commission
error (x̄ = 12.3 percent; Table
36.2). On the site where a significant amount of com- 
mission was reported (36.8 percent), the number of 
species separated into the LOORs groups in the rank 
correlation test was extremely small, and thus our 
ability to detect a significant correlation between 
LOORs and commission was weak (Tables 36.2 and
36.3).
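The rank correlation between LOORs group and the number of species in commission can be sketched as follows. This is a minimal illustration that assumes Spearman's coefficient as the rank statistic (the chapter does not show the computation); the function names and the counts are hypothetical and are not data from the tables.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: five LOORs groups and the number of species
# in commission within each group on one test site.
loors_group = [1, 2, 3, 4, 5]
species_in_commission = [9, 7, 4, 2, 1]
rho = spearman(loors_group, species_in_commission)
```

With only a handful of groups, as for the herptile data, even a strong coefficient rests on very few points, which is the sample-size caution raised in the text.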

Because of similarities between this analysis and 
that of Boone and Krohn (1999), further investiga- 
tions still need to be made into the use of a priori 
Figure 36.3. For each test site with avian data, the number of species correctly modeled (circles, solid line) and
the number of species in commission (squares, dashed line), plotted by LOORs group. Sites are ordered from smallest to largest: (a)
North Maine Forestlands, Moosehead Lake Area; (b) Nesowadnehunk Field, Baxter State Park; (c) White Mountains National Park;
(d) Sunkhaze Meadows National Wildlife Refuge; (e) Holt Research Forest; (f) Petit Manan National Wildlife Refuge; (g) Rachel Carson
National Wildlife Refuge; (h) Moosehorn National Wildlife Refuge; (i) Mount Desert Island/Acadia National Park.


ranking of species detectability to separate apparent 
from actual error in species predictions. There are 
major differences between these two studies, however, 
which are worthy of mention. First, the predictions of 
vertebrate occurrences we used were based upon an 
operational gap analysis, meaning that a statewide 


vegetation and land-cover map created with remotely 
sensed data (Hepinstall et al. 1999) was the main data 
layer underlying species predictions (Boone and Krohn 
1998a,b). In contrast, the vertebrate predictions re- 
ported in Boone and Krohn (1999) were not based on 
a statewide vegetation map but instead relied on the 
data available for each site, which in some cases
included lists of vegetation cover (Boone and Krohn
1999). Unlike Boone and Krohn (1999), we wanted to 
study amphibians and reptiles as well as birds. Because 
there are few herptiles breeding in Maine, the number 
of LOOR groups was reduced to three. The number of 
LOOR groups identified by Boone and Krohn (1999) 
for birds was reduced from ten to five. This change 
tended to smooth some of the graphs of the number of 
species correctly modeled versus LOORs (e.g., note
Moosehorn National Wildlife Refuge in Fig. 3b of 
Boone and Krohn [1999] versus our Fig. 36.3h) but 
did not change the overall patterns. Finally, this study 
reports data from nine test sites, three more than were
used by Boone and Krohn (1999). 


Conclusions 


Having an ecological context in which to evaluate 
commission error is important and will help investiga- 
tors to have greater confidence in their predictive 
models (Fielding and Bell 1997). The ideal situation in 
validating habitat-association models designed to pre- 
dict the presence/absence of terrestrial vertebrates 
would be to have standardized field censuses captur- 
ing the presence of all species on test sites to compare 
to the predicted distributions. However, until such
detailed surveys are available, an a priori ranking system,
such as LOORs, will permit fuller interpretation of
rates of commission by helping to distinguish
commission errors that are actual versus those resulting
from incomplete field data. An ability to distinguish 
actual errors of overprediction from incomplete test 
data is critical to understanding the level of confidence 
model users should have in their findings. 
Assessment of Spatial Autocorrelation in
Empirical Models in Ecology 


Mary Cablk, Denis White, and A. Ross Kiester 


A Brief Overview of Error and 
Error Sources 


The existence of error in spatial analyses is a well- 
known occurrence (Veregin 1989). In statistical mod- 
els, for example, error is introduced by simply calcu- 
lating and using a mean value. We can estimate the 
error associated with a population mean by calculat- 
ing the standard deviation or setting confidence limits. 
The resulting loss of accuracy is termed random error 
because there is no regularity in the direction or mag- 
nitude of the error. Systematic errors, on the other 
hand, are those that result from the introduction of a 
fixed and consistent difference from the true value. Er- 
rors are generated with successive iterations of analyt- 
ical processing and as such become compounded. This 
compounding of error from multiple processing itera- 
tions is termed error propagation. 

At the most fundamental level, error is introduced 
into an analysis by limiting precision. Rounding
numbers from four decimal places to two, for example,
introduces error. Additional calculations involving the
parameter that has been rounded may or may not be
precise, but they will be somewhat erroneous. At this
level, we assume that such introduced error is within 
acceptable bounds of what we consider noise. How- 
ever, as the analysis continues, we continue to round 
and thus continue to propagate error. As analyses
increase in complexity, additional error types may be
incurred. Although error and error propagation have
been studied in detail (Chrisman 1982; Walsh et al. 
1987; Lunetta et al. 1991; Lanter and Veregin 1992; 
Haining and Arbia 1993; Fielding, Chapter 21; Karl et 
al., Chapter 51), tracking error propagation in prac- 
tice is difficult, particularly in spatial data. 
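The effect of repeated rounding can be illustrated with a few lines of arithmetic. The sketch below is only an illustration under assumed inputs: the function name, the factor 1.2345, and the ten processing steps are hypothetical and do not come from any analysis in this chapter.

```python
def chained_product(factors, decimals=None):
    """Multiply factors in sequence, optionally rounding after every step."""
    result = 1.0
    for factor in factors:
        result *= factor
        if decimals is not None:
            # Limiting precision at each step is where error is introduced.
            result = round(result, decimals)
    return result


# Hypothetical example: ten processing steps, each scaling by 1.2345.
factors = [1.2345] * 10
full_precision = chained_product(factors)
rounded_each_step = chained_product(factors, decimals=2)

# The gap between the two results is the rounding error that has
# accumulated (propagated) over the ten steps.
propagated_error = abs(full_precision - rounded_each_step)
```

Each individual rounding changes the value by at most 0.005, yet the difference at the end of the chain is several times larger, because every rounded value feeds the next calculation.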

Spatial analyses have undetectable errors that are 
generated at the basic production level. Thickness of 
pen, the wobble of a hand-drawn line, the number of 
nodes, or any other fundamental step in the mapmak- 
ing process introduces error to a thematic or other 
data layer. In an effort to introduce some level of stan- 
dardization and quality, the U.S. Geological Survey 
(USGS) adopted the 1947 Bureau of Budget National 
Map Accuracy Standards (NMAS) to produce stan- 
dard map products with a known degree of certainty, 
or uncertainty, in horizontal and vertical map dimen- 
sions. Map errors are usually not accounted for in 
spatial analyses because they are difficult to identify, 
quantify, and rectify. Veregin (1989) identified five di- 
mensions, or categories, of opportunities for error in 
spatial databases: thematic, cartometric estimates, 
data compilation, geographic information system 
(GIS) operations, and other general issues. Of these di- 
mensions, the easiest to quantify are those of classifi- 
cation of continuous data, such as remotely sensed im- 
agery. This error, thematic error, can be accounted for 
and in many instances can be resolved with concen- 
trated and guided effort. 
Thematic error concerns whether or not a given
area, pixel, polygon, or other spatial map feature is
an accurate representation of the landscape or 
mapped feature. This does not account for whether 
or not the classification scheme itself is an accurate 
representation of the landscape. Error matrices, 
sometimes referred to as confusion matrices, are gen- 
erated for classifications with the purpose of evaluat- 
ing a map product for thematic accuracy. That is to 
say, does this point on the map accurately represent 
what is on the ground? The associated error matrix 
provides the user with values indicating how true the 
map product is with respect to actual conditions on 
the ground. Error is calculated for each class and for 
the overall mapped area. Additionally, errors of 
omission and commission can be calculated to give 
the user information to update and refine the classifi- 
cation or for estimating confidence on a per-class 
basis in additional analyses. However, unless each 
minimum mapping unit, which may be a pixel, poly- 
gon, or any other feature with spatial properties, has 
been verified on the ground, there is still unknown 
and unquantified error. Furthermore, this error is un- 
known geographically. 
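An error matrix and the per-class errors of omission and commission can be computed as in the following sketch. It is a minimal illustration, not the procedure of any particular software package; the function names, class labels, and the eight checked sites are hypothetical.

```python
def error_matrix(reference, mapped, classes):
    """Cross-tabulate mapped class (rows) against ground reference class (columns)."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, mp in zip(reference, mapped):
        matrix[index[mp]][index[ref]] += 1
    return matrix


def accuracy_summary(matrix, classes):
    """Overall accuracy plus commission and omission error for each class."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(classes)))
    summary = {"overall": correct / total}
    for i, c in enumerate(classes):
        mapped_as_c = sum(matrix[i])                  # row total
        truly_c = sum(row[i] for row in matrix)       # column total
        summary[c] = {
            # Commission: mapped as c but actually something else.
            "commission": 1 - matrix[i][i] / mapped_as_c if mapped_as_c else None,
            # Omission: actually c but mapped as something else.
            "omission": 1 - matrix[i][i] / truly_c if truly_c else None,
        }
    return summary


# Hypothetical ground-checked sites.
classes = ["forest", "water", "field"]
reference = ["forest", "forest", "forest", "water", "water", "field", "field", "field"]
mapped = ["forest", "forest", "field", "water", "water", "field", "forest", "field"]

matrix = error_matrix(reference, mapped, classes)
summary = accuracy_summary(matrix, classes)
```

The summary describes only the checked sites; as noted above, error at unverified mapping units remains unknown both in amount and in location.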

An understanding of accuracy and error in classi- 
fied or other thematic maps is critical when incorpo- 
rating these data into a greater analysis. In studies of 
biodiversity, for example, thematic maps representing 
life zones, ecoregions, land cover, or vegetation com- 
position are used to design field sampling strategies, 
conduct surveys, estimate habitat availability, and re- 
late presence or absence to geographic locations. If 
these data are inaccurate, then subsequent analyses in- 
corporating these data will be flawed as well. The dif- 
ficulty in estimating and tracking these errors approx- 
imates the impossible. 

An alternative to using thematic maps for correla- 
tion analyses of ecological data is to use continuous 
data, such as time series. Continuous data that re- 
tains its spatial component can be collected at any 
scale (grain) and analyzed using geostatistical meth- 
ods to quantify pattern. Continuous data can be 
collected from many sources, such as field data, time 
series instrument recordings, or even satellite 
imagery. 


Implication of Error 
in Ecological Modeling 


There will always be error in modeling and computing 
because perfection does not exist where approximation 
or estimation is involved. To avoid errors associated 
with the process of creating thematic maps and to cir- 
cumvent propagating unknown vectors and magnitudes 
of error, we might choose to use continuous data in 
analyses. Standard statistical methods can be used on 
continuous data and error terms can be estimated (e.g.,
standard deviation, ε). The difficulty of analyzing
spatial data with standard statistical techniques for hy- 
pothesis testing is that the assumption of independence 
is violated (Fortin et al. 1989; Legendre and Fortin 
1989). Spatial data are inherently autocorrelated. In na- 
ture, ecological phenomena or discrete features influ- 
ence neighbors, are influenced by neighbors, or both. 
The level of neighbor influence may vary with distance 
and other factors, making the autocorrelation function 
nonlinear. In general, the closer the neighbors, the 
greater the level of correlation, which distorts statistical 
tests of significance in analyses such as correlation, re- 
gression, or analysis of variance (Cliff and Ord 1981). 

In ecology, we use this inherent autocorrelation to 
our advantage to make predictions where sampling 
does not or cannot occur. Therefore, we face a 
dilemma in ecological studies: we seek to explain eco- 
logical patterns but we have yet to develop tools that 
allow us to do so without violating our own set of 
terms. The inherent structure of the distribution of the 
natural world prohibits us from accurately explaining 
or evaluating the underlying pattern-process phenom- 
ena. Because of this paradigm, spatial analyses tend to 
be conducted either on continuous data analyzed 
using geostatistics or on a combination of thematic 
data operated on in a GIS. Associative wildlife habitat 
relationship models are an example of where true spa- 
tial methods would greatly advance our ability to 
quantify pattern-process relationships. 


Quantifying Pattern-Process Relationships 
with Both Discrete and Thematic Data 


Traditional habitat models are based on the concept 
or technique of categorization. For example, certain 
species are known to occur in or are correlated with
specific habitats. If we want to quantify a species-habitat
relationship, we can mathematically relate the pres- 
ence or absence of the particular species with the oc- 
currence, amount, or distribution of a specific habitat 
type or types. When a statistically significant relation- 
ship is derived, the results can be explained based on 
an empirically derived statistical model. In reality, 
species are not related to class type or habitat type;
rather, there exists an underlying relationship between
the species and compositional landscape features or 
other characteristics of a certain class or habitat type. 
These components may include productivity, vertical 
vegetative complexity, temperature or photo gradients, 
among others. The reification of habitat by categoriza- 
tion in these circumstances thus serves to introduce a 
level of subjective interpretation that may mask the 
true correlations that exist between species and their 
environments. 

If we further refine the analysis of species-habitat 
relationships and pose the question “what mecha- 
nisms shape the patterns of biodiversity,” we en- 
counter more difficulties using categorically based 
methods. For example, the relationship between 
species diversity and a landscape comprising sixty-six
categories will be different from the relationship
between diversity and the same landscape classified 
into thirteen categories (see Fig. 37.1 in color section). 
Therefore, the modeled relationships for species-habi- 
tat interactions or presence/absence predictions will 
vary depending on the land-cover base map used. The 
relationships found between an individual species and 
a sixty-six-class base map may be interpreted differ- 
ently than if a base map with fewer or a greater num- 
ber of categories were used. 

A level of uncertainty exists in the use of categori- 
cal maps regardless of the number of interpreted 
classes. No two independent expert assessments will 
interpret a given landscape in exactly the same fashion 
because there is no universally accepted standard or 
set of rules that define land-cover types. Even given a 
set of criteria, there will exist variability in land-cover 
interpretation due to human-based differences regard- 
less of interpreted media such as aerial photographs, 
hardcopy satellite image prints, or digital imagery. An- 
other and perhaps more abstract issue with using 


land-cover types as a basis for defining habitat is the 
assumption that fauna interpret or respond to the 
landscape in the same way as humans. Finally, land- 
cover classes do not represent temporal elements of a 
landscape. Temporal components might include length 
of photosynthetic productivity, timing of greenup or 
senescence, or duration of snowpack. These factors 
are not included in species-habitat assessments that 
are based on single-date derived land-cover maps. 
Wolff (1995) presents other criticisms of species-habi- 
tat association studies. He discusses ten points that ad- 
dress shortcomings of correlational analyses that in- 
clude caveats of short-term localized studies, lack of 
independence in data, lack of replication, and poten- 
tial differences in landscape interpretation by humans 
versus the individual animals. 

The situation becomes more complicated when 
more than one species is evaluated. Groups of similar 
species may aggregate on the landscape in a similar 
pattern but with overlap in range while other species 
from different taxa may exhibit very different patterns 
of distribution with or without overlap in range. The 
pattern-process mechanisms become increasingly com- 
plex and nested and the results of these analyses are 
difficult to interpret. Because of the complexity of in- 
teractions between pattern and process of multiple 
species analyses, standard statistical techniques again 
fall short. Independent variable interactions may not 
be linear or known a priori and therefore may be ex- 
cluded from a final empirical model. The assumption 
of independence remains violated. Finally, it is difficult 
to reconstruct spatial data once it has been despatial- 
ized, or reorganized into a structure for standard sta- 
tistical analysis. Therefore, one place to begin an 
analysis of the spatial pattern of multiple species with 
multiple interacting independent variables is with ex- 
ploratory data analysis. 


Exploratory Data Analysis: An Alternative 
to Standard Statistical Techniques 


Exploratory data analysis (EDA) is a means to quan- 
tify the inherent structure and variable interactions 
within a data set rather than forcing the data to fit a 
predefined or derived model. The fundamental philos- 
ophy is to use as much of the data as possible rather 
than creating summaries (such as means) that discard
data. Exploratory techniques that are data-driven, 
nonparametric, and computer-intensive are alterna- 
tives to traditional Gaussian models for quantifying 
nonlinear structures in data (Tukey 1977; Miller 
1994b), in contrast to Neyman-Pearson hypothesis 
testing. The computational intensity of large data sets 
at one time limited researchers' abilities for analysis, 
but with modern computers data volume is less of a 
limitation. With exploratory techniques, we can apply 
more-complicated statistical analyses to explore and 
describe data and to draw valid statistically based in- 
ferences on very large data sets (Efron and Tibshirani 
1991). Exploratory methods allow us to uncover and 
quantify the structure inherent in data, free of tradi- 
tional assumptions of normality. 

In Oregon, the process-pattern relationship be- 
tween vertebrate richness and environmental hetero- 
geneity was expected to have a hierarchical structure 
of complexity, and for our study, interactions between
variables were unknown. We were interested in quan- 
tifying this hierarchical structure inherent in the data 
to provide insight for understanding process-pattern 
relationships rather than fitting the data to a prede- 
fined curve. With the use of exploratory data analysis 
and graphical display tools, these complex structures 
were described and interpreted. 


Study Objectives 


This study investigated the relationships of vertebrate
species richness (birds, mammals, reptiles, and
amphibians, considered separately) with vegetation
phenology, terrain, and climate for the state of Oregon. We
attempted to maximize the number of species investi- 
gated and to directly correlate vertebrate richness with 
meaningful spatial and temporal environmental vari- 
ables. Although the goal of this study was to deter- 
mine whether satellite imagery, digital elevation model 
(DEM) data, and climate variables could be used di- 
rectly to better explain observed pattern-process inter- 
actions of vertebrate species richness with the land- 
scape, we present in this chapter our methods for 
evaluating the predictive and explanatory models de- 
rived in this process. 


Data Compiled and Reviewed 


A total of twenty-seven variables were used in this 
analysis (Table 37.1). These variables included indices 
derived from 1992 satellite imagery, DEM variables, 
and climate variables. Descriptions of each of these 
variables are given in Table 37.2. Data were collected 
at two different grains: 1-km² pixel data, including
Advanced Very High Resolution Radiometer
(AVHRR)-derived parameters and DEM variables,
and climate and richness data at 646-km² hexagons.
The grain of the study was the 646-km² hexagon, and
pixel data were aggregated up to the hexagon scale. 
Two summary statistics, median and variance, were 
calculated for variables within each of the 434 hexa- 
gons in Oregon. Median was not calculated for aspect 
and median values were not available for climate vari- 
ables so mean values were substituted. Diversity, the 
number of different pixel values within a hexagon, 
was calculated for elevation, slope, and aspect as an 
additional measure of terrain variability. 
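The aggregation of pixel values to hexagon summaries can be sketched as follows. This is a minimal illustration: the function name, hexagon identifiers, and elevation values are hypothetical, and the use of the sample (n - 1) variance is an assumption, since the text does not say which variance estimator was used.

```python
from statistics import median, variance


def summarize_hexagons(pixels):
    """Summarize (hexagon_id, value) pixel records by hexagon.

    Returns, per hexagon, the median, the sample variance, and the
    diversity (number of different pixel values), as described in the text.
    """
    by_hexagon = {}
    for hexagon_id, value in pixels:
        by_hexagon.setdefault(hexagon_id, []).append(value)

    summaries = {}
    for hexagon_id, values in by_hexagon.items():
        summaries[hexagon_id] = {
            "median": median(values),
            # variance() needs at least two values; a single pixel has none.
            "variance": variance(values) if len(values) > 1 else 0.0,
            "diversity": len(set(values)),
        }
    return summaries


# Hypothetical elevation pixels (meters) falling in two hexagons.
pixels = [
    ("hex_1", 100), ("hex_1", 120), ("hex_1", 120), ("hex_1", 300),
    ("hex_2", 50), ("hex_2", 50),
]
summaries = summarize_hexagons(pixels)
```

For a circular variable such as aspect, the median is not meaningful, which is consistent with the text's note that it was not calculated.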

Species richness data were compiled on a statewide
tessellation of 646-km² hexagons by The Nature
Conservancy (TNC) for native mammals, breeding birds,
reptiles, and amphibians, and were reviewed by ex- 
perts throughout the state (Master 1996). Each species 


TABLE 37.1.

List of variables evaluated for correlational analyses with
vertebrate diversity by taxa (birds, mammals, amphibians, and
reptiles).

Summary statistic      PC     Greenness   DEM      Climate
Median and variance    PC2    maxv        elev
                       PC3    tot         slope
                              onv
                              sdn
                              range
                              sup
Variance                                  aspect   seas
Mean                                               seas
                                                   precip
Diversity                                 elev
                                          slope
                                          aspect
Total                  4      12          8        3
TABLE 37.2.

Interpretation of variables for each of the metrics used to model vertebrate species richness by taxa.

Variable                                    Abbrev.   Interpretation
Principal component 2                       PC2       Seasonal pattern of phenology
Principal component 3                       PC3       Nongrowing-season photosynthesis
Maximum NDVI value during growing seasonᵃ   maxv      Level of maximum photosynthetic activity
Total integrated NDVIᵃ                      tot       Net primary production during the growing season
NDVI value at start of growing seasonᵃ      onv       Level of potential photosynthetic activity at beginning of growing season
Rate of senescenceᵃ                         sdn       Rate of senescence
Range of NDVIᵃ                              range     Range of annual photosynthetic activity
Rate of greenupᵃ                            sup       Rate of greenup
Elevation                                   elev      Elevation
Slope                                       slope     Slope
Aspect                                      aspect    Cardinal direction (aspect)
Seasonal mean temperature difference        seas      Difference between monthly mean July and January temperatures
Precipitation                               precip    Rainfall

ᵃ Adapted from Reed et al. (1994).


was assigned an occurrence probability ranking for 
each hexagon. Recorded sighting and specimen collec- 
tion locations were registered within the hexagon grid. 
These hexagons were termed “confirmed” and as- 
signed a rank of 96-100 percent certainty that a given 
species occurred in that hexagon. The second ranking, 
“probable,” was defined as 80-95 percent confidence 
that a given species occurred in that hexagon. Based 
on habitat and expert opinion, a species was assigned 
to a hexagon with the “probable” ranking if there was 
not a recorded specimen in that hexagon. This analy- 
sis included species given either “confirmed” or 
“probable” ranking. 

The seasonal metrics derived from AVHRR time- 
series normalized difference vegetation index (NDVI) 
satellite imagery (Table 37.2) quantitatively character- 
ized seasonal phenological phenomena within the 
course of one year. The six metrics in this analysis 
were chosen based on phenological relevance, units of 
measure, and whether each was a continuous or a
discrete measure. Each of these metrics quantified
a component of the annual NDVI curve over the
course of one year thus capturing ecosystem dynamics 
(Reed et al. 1994). 

Principal component analysis (PCA) was conducted 
on 1992 time series NDVI data. PCA was calculated 


for twenty-one biweekly composites of National
Oceanic and Atmospheric Administration (NOAA)-AVHRR
1-km² pixel data for the state of Oregon for
the year 1992 to capture the greatest spatial and tem- 
poral variability in the twenty-one-scene data set. This 
type of analysis is well documented by Cicone and 
Olsenholler (1997), Eastman and Fulk (1993), Fung 
and LeDrew (1987), and Tucker et al. (1985). Princi- 
pal component 2 (PC2) represented seasonal vegeta- 
tion growth independent of the first principal compo- 
nent and accounted for 4.5 percent of the annual 
variation in NDVI. Principal component 3 (PC3) was 
interpreted to be baseline photosynthesis occurring in 
primarily coniferous-dominated or evergreen broad- 
leaf forests during the winter months and as nongrow- 
ing season vegetation characteristics. 
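The share of variation attributed to each principal component, such as the 4.5 percent reported for PC2, is the component's eigenvalue divided by the total variance. The sketch below illustrates that calculation on a small array in which rows are pixels and columns are composite dates. It is not the software used in the study: power iteration with deflation is swapped in here only to keep the example self-contained, the NDVI values are hypothetical, and in practice a linear algebra library would be used.

```python
def covariance_matrix(rows):
    """Sample covariance matrix of the columns (dates) across rows (pixels)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(p)]
    centered = [[r[k] - means[k] for k in range(p)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
            for a in range(p)]


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its eigenvector by power iteration.

    Assumes the start vector of ones is not orthogonal to the leading
    eigenvector; a library routine would not need this assumption.
    """
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = matvec(matrix, v)
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(matrix, v)))
    return eigenvalue, v


def variance_shares(rows, n_components):
    """Proportion of total variance carried by the leading components."""
    cov = covariance_matrix(rows)
    size = len(cov)
    total = sum(cov[k][k] for k in range(size))
    shares = []
    for _ in range(n_components):
        eigenvalue, v = top_eigenpair(cov)
        shares.append(eigenvalue / total)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(size)]
               for a in range(size)]
    return shares


# Hypothetical NDVI values: four pixels (rows) by three composite dates.
ndvi = [
    [0.2, 0.5, 0.3],
    [0.3, 0.7, 0.4],
    [0.6, 0.7, 0.6],
    [0.1, 0.3, 0.2],
]
shares = variance_shares(ndvi, 2)
```

In a time series of this kind the first component usually carries most of the variance, so later components such as PC2 and PC3 can have small shares and still describe interpretable seasonal patterns.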

Terrain variables were calculated from 1-km²
DEM obtained from the U.S. Geological Survey 
(USGS) EROS Data Center (EDC). Variables included 
were elevation (meters), slope (degrees), and aspect 
(degrees). Climate data were acquired for the state of 
Oregon as a subset from a larger database for the 
conterminous United States. January and July
temperatures and annual precipitation were compiled
to a 1-km² rectangular grid. January and July mean temperature
data were modeled and compiled using the method of 
Marks (1990). Monthly averages of forty-year means,
from approximately 1948 to 1988, were calculated at
approximately 1,200 stations in the Historical Climate
Network database. These values were first corrected to
potential temperatures at a reference air pressure of
1,000 mb using the station elevations and assuming a
normal adiabatic lapse rate. The potential temperatures
were then interpolated to the 1-km² grid using a
linear model. These interpolated values were then con- 
verted to estimated actual temperatures from the adia- 
batic lapse rate correction using corresponding eleva- 
tion values at each grid point. Annual precipitation 
data were compiled from the 10-kilometer resolution 
data to 1 kilometer (Daly et al. 1994). Mean
precipitation and seasonal difference, defined as mean
July temperature minus mean January temperature,
were calculated on a per-pixel basis and summarized
by hexagon as mean and variance. Seasonal difference
and precipitation were also characterized by range,
defined as the mean maximum value minus the mean
minimum value within a hexagon.
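The three temperature steps described above (reduce station values to a common reference level, interpolate, then restore the elevation effect at each grid point) can be sketched in simplified form. This is an illustration only, not the method of Marks (1990): the lapse rate of 6.5 degrees C per kilometer is an assumed value (the text says only "normal adiabatic lapse rate"), the reduction is done by elevation alone rather than to a pressure level, the interpolation is one-dimensional between two stations, and all names and station values are hypothetical.

```python
# Assumed lapse rate in degrees C per meter; not stated in the chapter.
LAPSE_RATE = 0.0065


def to_reference_level(temp_c, elev_m, lapse=LAPSE_RATE):
    """Remove the elevation effect from a station temperature."""
    return temp_c + lapse * elev_m


def from_reference_level(ref_temp_c, elev_m, lapse=LAPSE_RATE):
    """Restore the elevation effect at a grid point."""
    return ref_temp_c - lapse * elev_m


def interpolate_linear(x, x0, v0, x1, v1):
    """Linear interpolation of a value at position x between two stations."""
    return v0 + (x - x0) / (x1 - x0) * (v1 - v0)


# Hypothetical stations along a transect and one grid point between them.
station_a = {"x_km": 0.0, "elev_m": 200.0, "temp_c": 10.0}
station_b = {"x_km": 10.0, "elev_m": 1200.0, "temp_c": 4.0}
grid_point = {"x_km": 5.0, "elev_m": 1000.0}

ref_a = to_reference_level(station_a["temp_c"], station_a["elev_m"])
ref_b = to_reference_level(station_b["temp_c"], station_b["elev_m"])
ref_grid = interpolate_linear(grid_point["x_km"],
                              station_a["x_km"], ref_a,
                              station_b["x_km"], ref_b)
estimate = from_reference_level(ref_grid, grid_point["elev_m"])

# For comparison: interpolating the raw temperatures ignores elevation.
naive = interpolate_linear(grid_point["x_km"],
                           station_a["x_km"], station_a["temp_c"],
                           station_b["x_km"], station_b["temp_c"])
```

The point of interpolating at the reference level is that the grid point's own elevation, not just its position between stations, determines the final estimate.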

Data for the response, climate, and topographic 
predictor variables were assembled as part of a biodi- 
versity research program that investigated mechanisms 
influencing the distributions of biodiversity, objective 
methods for prioritization places for conservation of 
biodiversity, and consequences for biodiversity of fu- 
ture landscape change (White et al. 1999). 


Spatial Statistical Methods 


Two spatial statistical methods were used: semivariance
and Moran's I. Semivariance (γ) is a model of the
average degree of similarity between observations as a 
function of distance (Rossi et al. 1992). Semivariance 
values range between 0, which indicates complete au- 
tocorrelation, and ee, which indicates complete ran- 
domness in the data. A semivariogram shows autocor- 
relation as a function of distance and when plotted 
represents spatial variability (Cohen et al. 1990; Le- 
gendre and Fortin 1989). The two parameters used to 
describe the spatial pattern of a data set from a semi- 
variogram are the sill, which is the value at which the 
curve levels off, and the range, which is the lag dis- 
tance corresponding to the sill. Moran's I, I(d), is a 
spatial autocorrelation coefficient that indicates signif- 
icant patch size pattern in spatial data. The two meth- 


ods are complementary for evaluating the spatial 
structure of autocorrelated data. Values of I(d) range 
between -1 and 1 with positive values corresponding 
to positive autocorrelation, zero indicating random- 
ness, and negative values representing negative auto- 
correlation. Positive significant values indicate similar- 
ity at the scale of the lag distance, or distance between 
pairs of points. Negative significant values show the 
distance between peaks and troughs. Spatial richness 
patterns are thus indicated by a series of significant 
positive and negative values (Legendre and Fortin 
1989). Standardized semivariance and Moran's I with 
95 percent confidence intervals were calculated for 
each of the taxa and for residuals from each of the re- 
gression tree models. The 95 percent confidence inter- 
val computed at each lag distance was a test for signif- 
icance where the null hypothesis was that the 
coefficient at a given lag distance was not significantly 
different from zero. 
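Both statistics are computed from pairs of observations that fall within a given lag distance class. The sketch below uses the standard textbook forms of semivariance and Moran's I for one distance class; it is not the software used in the study, it omits the standardization and the confidence intervals described above, and the function names, transect coordinates, and values are hypothetical.

```python
from itertools import combinations
from math import hypot


def lag_pairs(points, lower, upper):
    """Index pairs of points whose separation d satisfies lower < d <= upper."""
    return [(i, j)
            for (i, p), (j, q) in combinations(enumerate(points), 2)
            if lower < hypot(p[0] - q[0], p[1] - q[1]) <= upper]


def semivariance(values, pairs):
    """Half the mean squared difference between paired values."""
    return sum((values[i] - values[j]) ** 2 for i, j in pairs) / (2 * len(pairs))


def morans_i(values, pairs):
    """Moran's I for one distance class, with each unordered pair counted once."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = sum(dev[i] * dev[j] for i, j in pairs)
    return (n / len(pairs)) * cross / sum(d * d for d in dev)


# Hypothetical transect of six hexagon centers one unit apart.
points = [(float(x), 0.0) for x in range(6)]
first_lag = lag_pairs(points, 0.5, 1.5)

gradient = [1, 2, 3, 4, 5, 6]          # neighbors are similar
alternating = [1, 6, 1, 6, 1, 6]       # neighbors are dissimilar

gamma_gradient = semivariance(gradient, first_lag)
i_gradient = morans_i(gradient, first_lag)
gamma_alternating = semivariance(alternating, first_lag)
i_alternating = morans_i(alternating, first_lag)
```

Repeating the calculation over successive distance classes gives the semivariogram and the correlogram; the smooth gradient yields low semivariance and positive I at the first lag, while the alternating series yields high semivariance and negative I.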


Exploratory Statistics—Classification 
and Regression Tree Analysis 


Classification and regression tree analysis (CART) is a 
tree-based exploratory data analysis method that has 
been shown to be effective in identifying and estimat- 
ing complex hierarchical relationships in multivariate 
data such as satellite-derived indices, DEM, and other 
digital data (Rathert et al. 1999; Michaelsen et al. 
1994). The data set is repeatedly partitioned into
homogeneous subsets using binary recursive partitioning
until the entire data set has been evaluated. Node 
splits are determined by deviance, where a split occurs 
at a given node such that the change in deviance is 
maximized and the variance of all resulting subsets is 
minimized. CART is useful where there are expected 
but unknown nonlinear or nonadditive predictor-re- 
sponse interactions (Michaelsen et al. 1994). Regres- 
sion tree analysis is a data-driven, nonparametric, 
computer-intensive method (Miller 1994b) that selects 
variables and values for splitting which best discrimi- 
nate among responses (Efron and Tibshirani 1991). 
Because this method results in overfit models, meaning 
all data are categorized, regression trees must be 
pruned to select the most parsimonious subset that 
best explains the relationship between response and 
predictors. Cross-validation was used to select
subtrees that have optimal predictive performance with
lowest complexity (Breiman et al. 1984; Clark and 
Pregibon 1992). In a cross-validation, regression trees 
are computed on nine-tenths of the data and are 
checked with the one-tenth of the data withheld 
(Miller 1994b). Three cross-validation methods were 
used to determine the appropriate number of terminal 
nodes for the pruned trees: cost-complexity pruning, 
one standard error rule (one SE), and adjusted mini- 
mum risk (AMR). Each method was run on ten itera- 
tions where the suggested number of terminal nodes 
was calculated from each method and the complexity 
penalty was 0.01. Cross validation was run ten times 
for each of the five taxa. For each taxon, the median 
value of the ten cross validations was calculated. Tree 
length was based on median number of suggested 
nodes from the ten iterations. 
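The choice of a node split by deviance can be sketched for a single node as follows. This is a minimal illustration of binary recursive partitioning for a numeric response, taking deviance as the sum of squared deviations from the node mean; it is not the implementation used in the study and it does not include pruning or cross-validation. The function names and the six hexagon records are hypothetical, with predictor names borrowed from Table 37.2.

```python
def deviance(response):
    """Sum of squared deviations from the node mean."""
    if not response:
        return 0.0
    mean = sum(response) / len(response)
    return sum((y - mean) ** 2 for y in response)


def best_split(rows, response):
    """Find the predictor and threshold giving the largest drop in deviance.

    rows is a list of dicts of predictor values, one per observation.
    Returns (variable, threshold, deviance_reduction).
    """
    parent = deviance(response)
    best = (None, None, 0.0)
    for variable in rows[0]:
        levels = sorted({row[variable] for row in rows})
        # Candidate thresholds lie midway between adjacent observed values.
        for low, high in zip(levels, levels[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, response) if row[variable] < threshold]
            right = [y for row, y in zip(rows, response) if row[variable] >= threshold]
            reduction = parent - deviance(left) - deviance(right)
            if reduction > best[2]:
                best = (variable, threshold, reduction)
    return best


# Hypothetical hexagons: median elevation (m), mean precipitation (cm),
# and species richness as the response.
rows = [
    {"elev": 100, "precip": 50}, {"elev": 200, "precip": 90},
    {"elev": 300, "precip": 60}, {"elev": 1500, "precip": 80},
    {"elev": 1600, "precip": 55}, {"elev": 1700, "precip": 95},
]
richness = [10, 12, 11, 30, 31, 29]

variable, threshold, reduction = best_split(rows, richness)
```

Applying the same search recursively to each resulting subset grows the full tree, which is then pruned back to the size suggested by cross-validation.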

Analyses resulted in numeric and graphical means 
for evaluating final CART models. Final models from 
each CART indicated node splits, values at node
splits, predicted average number of species per hexa- 
gon at terminal nodes, and number of hexagons at 
each terminal node. Semivariance and Moran's I were 
calculated on residuals from each CART model by 
taxa as a means to evaluate randomness in residuals 
similar to using a residual plot to check the fit of a re- 
gression model. In this manner, we integrated nonspa- 
tial statistical methods with spatial analyses as a check 
on how well the regression trees fit the data and to de- 
termine how well the models dealt with autocorre- 
lated richness data. 


CART Results for Predicting Richness 


CART explained between 55 and 91 percent of the 
variation in taxonomic richness for vertebrates in Ore- 


TABLE 37.3. 


gon. Table 37.3 shows deviance explained—a measure 
similar to the familiar R2 value in standard statistics— 
for each regression tree model by taxa and lists the op- 
timal number of terminal nodes for each tree after 
pruning. Semivariance and Moran's I were calculated 
for the richness data by taxa and for the residuals of 
the regression tree models. This served to check how 
well the regression trees fit the data and to determine 
how well the individual models dealt with autocorre- 
lated richness data. Like residual plots from classical 
regression methods, no pattern was expected from 
models that fit the data well. Spatial plots of residual 
values for regression tree models exhibited no pattern. 
Results from the spatial statistics on both the original 
data and residuals indicated that there was significant 
spatial autocorrelation to the vertebrate data but that 
CART was able to effectively model correlative rela- 
tionships despite autocorrelation in the original data. 
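Deviance explained can be computed from observed richness and the richness predicted at each terminal node. The sketch below assumes the usual definition, one minus the ratio of residual deviance to total deviance, which is what makes it comparable to R²; the function name and the values are hypothetical and are not results from Table 37.3.

```python
def deviance_explained(observed, predicted):
    """Proportion of total deviance accounted for by the model predictions."""
    mean = sum(observed) / len(observed)
    total = sum((o - mean) ** 2 for o in observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1 - residual / total


# Hypothetical richness in six hexagons and the terminal-node means
# predicted for them by a two-node tree.
observed = [10, 12, 11, 30, 31, 29]
predicted = [11, 11, 11, 30, 30, 30]

explained = deviance_explained(observed, predicted)
residuals = [o - p for o, p in zip(observed, predicted)]
```

The residuals are the values to which semivariance and Moran's I are then applied when checking for remaining spatial pattern.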

The semivariograms and corresponding correlo- 
grams for the spatial pattern of mammal, bird, reptile, 
and amphibian distributions, respectively, are shown 
in Figures 37.2-37.5. Semivariance and Moran's I cal- 
culated on the residuals (the difference between the ac- 
tual species richness and predicted species richness) 
are also shown in Figures 37.2-37.5. As with residual 
plots from classical regression methods, no pattern 
was expected from models that fit the data well. 
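
For readers unfamiliar with the quantities plotted in these figures, the following sketch shows one common way to compute an empirical semivariance for a distance class and to evaluate a spherical semivariogram model with a nugget, partial sill, and range. It is an illustration under stated assumptions: the function names, parameter names, and toy data are invented, and the nonlinear least squares fit used for the figures is not reproduced here.

```python
import math


def empirical_semivariance(coords, values, d_min, d_max):
    """Half the mean squared difference over pairs with d_min < d <= d_max."""
    total, pairs = 0.0, 0
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            if d_min < math.dist(coords[i], coords[j]) <= d_max:
                total += (values[i] - values[j]) ** 2
                pairs += 1
    return total / (2 * pairs) if pairs else float("nan")


def spherical(h, nugget, partial_sill, rng):
    """Spherical model: rises from the nugget and levels off at the range."""
    if h <= 0:
        return 0.0
    if h >= rng:
        return nugget + partial_sill
    r = h / rng
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


# Toy data (invented): six cell centres on a line, one unit apart.
centres = [(float(x), 0.0) for x in range(6)]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

gamma_lag1 = empirical_semivariance(centres, values, 0.5, 1.5)
gamma_lag2 = empirical_semivariance(centres, values, 1.5, 2.5)

# Hypothetical parameters: nugget 2, partial sill 8, range 120 km.
at_half_range = spherical(60.0, 2.0, 8.0, 120.0)
at_range = spherical(120.0, 2.0, 8.0, 120.0)
```

In this toy gradient the semivariance keeps increasing with lag, whereas the spherical model flattens once the lag reaches the range. A nonzero nugget is the reason a fitted curve does not pass through the origin.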

For mammals (Fig. 37.2), there was significant spa- 
tial autocorrelation in the data to a lag of 180 kilome- 
ters, or an approximate neighborhood of six hexa- 
gons. A secondary peak in autocorrelation appeared 
at longer lags. Moran's I was significant and positive 
to a lag of 60 kilometers, a distance about equal to 
two hexagon center-to-center distances. At the maximum
lag distance, there was some indication of significant
negative autocorrelation. The range of the semivariogram
for mammal CART residuals was 120 kilometers and the
correlogram indicated a weak positive significance at a
lag equivalent to one hexagon center-to-center distance.


TABLE 37.3.


Deviance explained for regression tree models by taxa and the number of
terminal nodes for each data set.


                     Mammals   Birds   Reptiles   Amphibians
Deviance explained   0.55      0.67    0.57       0.91
No. terminal nodes   31        6       9          [illegible]


Figure 37.2. Semivariogram (A) and correlogram (C) for mammal richness and semivariogram (B) and
correlogram (D) for residuals of mammal CART model. Dashed lines on correlograms indicate 95% confidence
intervals. The fitted line on semivariograms is a spherical fitted model based on estimates using
nonlinear least squares.

Bird richness (Fig. 37.3) exhibited spatial autocor- 
relation to 340 kilometers, or nearly twelve adjacent 
hexagons. Moran's I coefficients were significant and 
positive to a lag of 100 kilometers. The residuals from 
the bird CART semivariogram and correlogram 
showed autocorrelation to lag distances of 200 kilo- 
meters and 27 kilometers, respectively. Some coeffi- 
cients on the correlogram were weakly and negatively 
significant at a 95 percent confidence interval. This 
was not apparent in the correlogram of bird CART 
model residuals.

Semivariance of reptile richness (Fig. 37.4) had a 
range of 240 kilometers and Moran’s I coefficients 
were positive and significant to a lag of 80 kilometers, 
or three adjacent hexagons. The CART residual
semivariogram had a range of 140 kilometers. The
corresponding correlogram indicated CART residuals were
positive and significant to one adjacent hexagon. The 
increasing trend in semivariance, which was apparent 
in the richness data, was not seen in the semivari- 
ogram of the reptile CART model residuals. Moran's I 
coefficients for reptile residuals were consistently 
closer to 0 than the coefficients for the corresponding 
reptile richness data. 

The amphibian semivariogram in Figure 37.5 
shows a significant positive correlation to a lag of 260 
kilometers and significant negative correlations begin 
at a lag of 320 kilometers. The semivariogram and 
correlogram of CART residuals for amphibians 
showed no indication of spatial trend. The correlogram
indicated significant positive correlation at the
smallest lag distance of 27 kilometers. A detailed
explanation of the difference between the amphibian
semivariogram and those of the other taxa follows.


Figure 37.3. Semivariogram (A) and correlogram (C) for bird richness and semivariogram (B) and
correlogram (D) for residuals of bird CART model. Dashed lines on correlograms indicate 95% confidence
intervals. The fitted line on semivariograms is a spherical fitted model based on estimates using
nonlinear least squares.


Discussion 


We used semivariance and Moran's I in the same way 
that residual plots are evaluated in regression analysis: 
as a means of determining goodness of fit for our 
CART models. If the models fit well, we did not ex- 
pect to see spatial autocorrelation in the semivari- 
ograms or correlograms of residuals. We did not ex- 
pect to see a pattern because the over- or 
underestimation of the number of species in each 
hexagonal cell should not be systematic. We did ex- 
pect to see a spatial pattern in the species richness data 
because ecological data are autocorrelated. Where 
CART excelled over standard regression modeling 
techniques was in the ability to make richness predic- 
tions based on spatially autocorrelated data that also 
exhibited nonlinear interactions. Thus, the use of CART
as an exploratory method and as a predictive modeling
tool, combined with a goodness-of-fit test adapted for
spatial data, was successful. The ability of CART to
explain the spatial variance of vertebrate richness in
Oregon showed promise but was somewhat dependent on
the underlying spatial pattern of the dependent data
(richness).

The spatial autocorrelation of the species data itself
presented an interesting challenge for statistical fitting,
and CART effectively modeled much of the spatial
variability and autocorrelation of the original data. For
any given explanatory variable, the value in a neighboring
hexagon was expected to be similar. In most cases, CART
models captured this spatial variability, as indicated by
the fact that model residuals reflected low-level noise
and were spatially random. This was supported by the
deviance explained, in combination with interpreted
semivariograms and correlograms of the individual
taxonomic distributions. All taxa had significant positive
autocorrelation, and CART effectively modeled these
groups, as evidenced by random semivariograms and
nonsignificant values of Moran's I of model residuals.
Statewide amphibian distribution, for example, was the
simplest pattern, had the most variability accounted for
in the final model (0.91), and had a random spatial
residual plot. Mammal and reptile distributions exhibited
the most complex spatial patterns and were likewise the
most difficult to model. CART explained a little over half
of the variability in mammal (0.55) and reptile (0.57)
richness data. Unlike the amphibian distribution, mammals
exhibited correlation at two different lag distances and,
as a result, were the most difficult pattern to explain.
The two different lag distances are apparent in the shape
of the semivariogram.


Figure 37.4. Semivariogram (A) and correlogram (C) for reptile richness and semivariogram (B) and
correlogram (D) for residuals of reptile CART model. Dashed lines on correlograms indicate 95% confidence
intervals. The fitted line on semivariograms is a spherical fitted model based on estimates using
nonlinear least squares.

Across taxa, results from semivariance and Moran's 
I indicated the CART methods reduced significant
autocorrelation in the richness data, but some pattern
remained in the residuals at the shortest lag distances.
Taxa with more complex spatial patterns were
more difficult to model and this was reflected in corre- 
sponding lower values of deviance explained. The de- 
viance explained for birds (0.67) was higher than for 
reptiles (0.57), and likewise reptiles had a more signif- 
icant positive value of Moran's I for short lag dis- 
tances. The case of amphibians is different because the 
pattern of amphibian distribution was not one of het- 
erogeneous richness patterns like the other taxa. 

The amphibian semivariogram is unlike those for
the other taxa and was fit with an exponential model.
This type of semivariogram does not have a sill and
likewise lacks a range. Exponential semivariograms
indicate a spatial gradient, which in the case of
amphibians in Oregon runs east-west, with lowest
richness in the east and highest richness in the west. The
existence of a gradient is also supported by the
significant positive and negative coefficients in the
corresponding correlogram. Despite this strong gradient
pattern in amphibian richness, the CART residuals did
not show a spatial pattern.


Figure 37.5. Semivariogram (A) and correlogram (C) for amphibian richness and semivariogram (B) and
correlogram (D) for residuals of amphibian CART model. Dashed lines on correlograms indicate 95% confidence
intervals. The fitted line on semivariograms is a spherical fitted model based on estimates using
nonlinear least squares.

An argument could be made that, because the un- 
derlying richness data and independent variables have 
spatial dependence, we should expect to see a corre- 
sponding pattern in the residuals. The existence of 
such a pattern in residuals would indicate systematic 
error. Semivariance and I(d) indicate spatial depend- 
ence across distance, that is, how similar neighboring 
values are at regular distance intervals. A significant 
pattern in residuals indicates that a model consistently 
over- or underpredicts species richness in a spatially 
dependent manner. Therefore, pattern in residuals
from the CART models is an indication of some sort
of bias, such as a suboptimal sample or grid-cell size.
Our results suggest that our hexagon size was not
optimal: for all taxa, the semivariance curves did not
pass through the origin. This nugget effect indicates
either measurement error or variation at a finer grain
size. The results of the spatial statistics calculated on
the amphibian residuals provide the strongest evidence
to support both the results of the CART models and
our method for evaluating
the goodness of fit of those models. Amphibian rich- 
ness had the most prominent spatial pattern and the 
highest deviance explained (0.91), and the analysis of 
residuals showed no spatial pattern. 


Expert Judgment Required When 
Selecting a Statistical Model 


The results from this study did not imply cause-and- 
effect relationships but did suggest relationships be- 
tween diversity of taxa and specific phenologic, cli- 
matic, and topographically related forces. It is 
impossible to determine what factor(s) directly caused 
the patterns of diversity on the landscape today for 
several reasons, the most obvious being that we are re- 
stricted to retrospective studies given the relatively 
short life span of humans on an evolutionary time
scale. We are also somewhat encumbered by our need
to aggregate and characterize, evidenced in this study
by the fact that none of the semivariograms of the
taxa passed through the origin. This indicated an
undetected pattern at a different spatial scale due to
inappropriate sample size. Regardless of grain and
extent, there will always be interpretation of the initial
response variable (i.e., number of species per given
area) and the explanatory variables (i.e., interpolated
climate or averaged reflectance). These interpretations
may create artifacts before any analysis is conducted. 
To effectively manage and interpret data and results 
from biodiversity analyses, however, we must aggre- 
gate, interpolate, and categorize. 

This study not only showed some general trends 
that support findings of previous work, it also added 
new insight because of the methods used to assess suit- 
ability of the regression trees for predicting vertebrate 
richness. The development of spatial statistics has cre- 
ated tremendous opportunity for researchers to quan- 
tify and explain patterns in nature, but these new tools 
must be used with caution and thoughtfulness. With
these new methods, researchers must question more
than just the ecological or statistical significance of
results. When expert judgment is required to select "the
best" result, subjective interpretation plays an
increasingly important role. Ecological interpretation of
two closely "best" fitting models may be very different. We
must have faith in our own knowledge to select the
model that is closest to "truth." The use of parametric
measures as a substitute for classified land-cover data 
for modeling vertebrate species richness may allow us 
to better relate patterns of diversity with correlated 
processes. In this manner, we have minimized subjec- 
tive interpretation in terms of landscape function and 
habitat distribution.


Ranked Modeling of Small Mammals
Based on Capture Data 


Vickie J. Smith and Jonathan A. Jenks 


Presence/absence models can be used to predict
species occurrence and to map distributions (Scott 
et al. 1993; Krohn 1996; Smith and Catanzaro 1996). 
These models are based on habitats used by the ani- 
mal and can be generated by season or year. When de- 
veloping these models, all habitats are weighted 
equally and assumed to contain similar densities of 
species. For ubiquitous species occupying a number of 
habitat types, presence/absence models may be less 
useful for predicting habitats that are selected or 
avoided. Under these conditions, ranked models may 
provide a more accurate depiction of species occur- 
rence and distribution than presence/absence models. 
Ranked models are formulated from capture data and 
habitat types are ranked, or weighted, based on higher 
capture densities in various habitat types. 

The objective of the study described in this chapter 
was to compare presence/absence models to ranked 
models that account for variation in the number of 
small mammals captured within habitats. A more ac- 
curate depiction of habitat selection ranked by species 
density would enhance knowledge of distribution pat- 
terns and conservation status. 

Although this study focuses on the use of capture 
data to assess the abundance of species in given habi- 
tat types, habitat quality is difficult to determine from 
trapping data without a study encompassing all
seasons and reproductive success associated with habitat
types. To accurately assess habitat quality, extensive
studies must be conducted that would measure repro- 
ductive state. Sources may occur where a high-density 
population has produced a large number of offspring. 
These offspring are forced into less-suitable habitat, 
where they occur at relatively high densities. In this 
case, a large number of individuals would not indicate 
high-quality habitat (Van Horne 1983). In spite of this 
knowledge, we have predicted occurrences based on 
the density of species in given habitat types. Reference 
to high-density habitats does not necessarily imply 
high-quality habitat. 


Study Area and Methods 


To generate ranked models, vegetation communities 
were sampled throughout eastern South Dakota, 
which consisted mostly of state-owned game produc- 
tion areas (GPA) and federally owned waterfowl pro- 
duction areas (WPA). These public areas contain rem- 
nant patches of mixed-grass prairie characterized by 
western wheatgrass (Agropyron smithii), needle-and- 
thread (Stipa comata), and sideoats grama (Bouteloua 
curtipendula) and are invaded by smooth brome (Bro- 
mus inermis). However, agricultural fields (i.e., corn, 
soybeans, wheat) dominate the landscape. Wetlands 
and shelterbelts also are common (Luttschwager et al. 
1994). 

Two satellite images from April 1992 were
obtained from the Multi-resolution Land Characteristics
Consortium (MRLC) for path 30, row 30 (scene
7) of south-central South Dakota and georeferenced at 
EROS Data Center, Sioux Falls, South Dakota. Se- 
lected scenes were chosen based on availability, clarity, 
and low percent cloud cover. Agricultural lands and 
wetlands were masked from the scene based on an un- 
supervised classification (Vogelmann et al. 1998) and 
accuracy assessment. An unsupervised classification 
was performed on perennial vegetation resulting in 
one hundred clusters. Clusters were evaluated using 
known land cover to perform a supervised classifica- 
tion throughout the remaining unclassified pixels in 
the scene. Once general land-cover categories were de- 
termined, digital coverages of ancillary data, such as 
elevation and soils, were used to reduce confusion 
among clusters (Lauver and Whistler 1993; Egbert et 
al. 1998; Vogelmann et al. 1998). 

Using IMAGINE software (ERDAS, Atlanta, Ga.), 
land-cover types from the general classification of 
Scene 7 were recoded into the appropriate ranked cat- 
egory (1-3). National Wetlands Inventory (NWI) 
basins were buffered by 90 meters. If a species oc- 
curred in wetlands, this coverage was added to the 
ranked land-cover coverage, resulting in a ranked pre- 
diction of density. 

Small mammals were captured using snap traps 
(Woodstream Museum Special and Victor) and live 
traps (Woodstream Hav-a-Hart and Sherman) through- 
out eastern South Dakota from 16 May to 28 August 
1998. Four trap lines were set per week and checked 
each morning for three days. Trap lines contained four 
to five traps (various combinations of live and snap 
traps) at each of twenty-five stations. Traps were baited 
with a mixture of oatmeal and peanut butter. Sampling 
areas were chosen based on the Environmental Protec- 
tion Agency’s (EPA) Environmental Monitoring and As- 
sessment Program (EMAP) hexagons (Csuti and Crist 
2000). GPAs and WPAs were chosen within the hexa- 
gons and traplines placed in wetlands, pasturelands, 
shelterbelts, deciduous trees, and grasslands. 


Presence/Absence Models 


National GAP standards (Csuti and Crist 2000) for 
creating presence/absence models were applied for the 
three species using literature on habitat use (deer
mouse [Peromyscus maniculatus] [Rumble 1982;
Forde et al. 1984; Hull-Sieg et al. 1984], meadow vole 
[Microtus pennsylvanicus] [Lokemoen and Duebbert 
1976; Wilhelm et al. 1981], and arctic shrew [Sorex 
arcticus] [Gruebele and Steuter 1988]) and available 
capture information. Presence/absence models were 
created for the deer mouse, meadow vole, and arctic 
shrew, giving equal weight to each of the habitats 
within which the species was captured. Using IMAG- 
INE software, each habitat type was assigned a 1 if 
the species was present and a 0 if the species was ab- 
sent. Species that occurred within wetland habitat 
types were assigned a 1 inside the buffered wetland 
coverage. Thus, if a given pixel in the coverage
contained suitable habitat in the land-cover image, the 
wetland coverage, or both, it was assigned a code of 1 
(for presence). 


Ranked Models 


Capture data for three small mammal species, the deer 
mouse, meadow vole, and arctic shrew, were evalu- 
ated for each habitat type within trap lines to deter- 
mine species density. Habitat types included wetlands, 
pasture, shelterbelts, deciduous trees, grasslands, and 
Conservation Reserve Program (CRP) lands. To allow 
comparison between habitats, the number of small 
mammals captured in each habitat was divided by 
trap nights and multiplied by one thousand (captures 
per thousand trap nights). A 1 was assigned to the 
habitat type with the lowest capture rate for each 
species. Ratios between the other capture rates were 
calculated for the remaining habitat types in which 
species were captured. These ratios were ranked from 
1 to 3 (3 being the highest) to differentiate between 
capture abundance (Table 38.1). 
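
The standardization described above can be sketched as follows. This is an illustrative example only: the function names and the capture and trap-night counts are hypothetical, while the meadow vole rates in the second part are the values that are legible in Table 38.1. The rule used to convert ratios into ranks of 1 to 3 is not spelled out in the text, so the sketch stops at the ratios.

```python
def captures_per_thousand(captures, trap_nights):
    """Standardize a raw capture count to captures per 1,000 trap nights."""
    return 1000.0 * captures / trap_nights


def ratios_to_lowest(rates):
    """Ratio of each nonzero rate to the lowest nonzero rate (0 if no captures)."""
    nonzero = [r for r in rates.values() if r > 0]
    if not nonzero:
        return {habitat: 0.0 for habitat in rates}
    lowest = min(nonzero)
    return {h: (r / lowest if r > 0 else 0.0) for h, r in rates.items()}


# Hypothetical effort: 12 captures over 800 trap nights.
example_rate = captures_per_thousand(12, 800)

# Meadow vole rates legible in Table 38.1 (captures per 1,000 trap nights).
vole_rates = {
    "wetlands": 4.4,
    "pasture": 0.0,
    "shelterbelts": 7.8,
    "deciduous trees": 11.9,
    "conservation reserve program": 10.4,
}
vole_ratios = ratios_to_lowest(vole_rates)
```

The habitat with the lowest nonzero rate receives a ratio of 1, and habitats with no captures stay at 0, which matches the 0 ranks shown in the table.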


Comparison of Models 


Pixels contained in each category were converted to 
hectares for analysis. Models were compared by sub- 
tracting the value of each pixel of the presence/absence 
model from the ranked model. If no difference oc- 
curred between the two models, both methods were 
effective at determining density. Comparisons of 
hectares present in each rank may determine the areas 
with greater density for each species. 
TABLE 38.1.


Captures per thousand trap-nights and rank (in parentheses) by habitat type for deer mouse (Peromyscus maniculatus), meadow
vole (Microtus pennsylvanicus), and arctic shrew (Sorex arcticus).


Species        Wetlands   Pasture           Shelterbelts   Deciduous trees   Grasslands        Conservation Reserve Program
Deer mouse     15.0 (2)   [illegible] (1)   24.1 (3)       [illegible] (2)   18.2 (3)          13.3 (2)
Meadow vole    4.4 (1)    0.0 (0)           7.8 (2)        11.9 (3)          [illegible] (2)   10.4 (3)
Arctic shrew   0.0 (0)    0.0 (0)           0.3 (1)        0.0 (0)           0.3 (1)           0.0 (0)


Results


Presence/Absence Models


Presence/absence models for the deer mouse and the
meadow vole gave similar results (see Figs. 38.1 and
38.2 in color section). Both species used all habitats
(grassland, including pasture and CRP [Rumble 1982;
Forde et al. 1984; Agnew et al. 1986; Apa et al.
1991], shelterbelts [Hull-Sieg et al. 1984; Hodorff et
al. 1988], and wetlands [Wilhelm et al. 1981]) except
agriculture and were present in 900,708 hectares
(60.3 percent of the 1,493,554-hectare area in Scene
7). The arctic shrew occurred in grassland, including
pasture and CRP (Gruebele and Steuter 1988), and
was present on 230,213 hectares (15.4 percent of total
area in Scene 7) (Fig. 38.3).


Ranked Models


Ranked models for the deer mouse, meadow vole, and
arctic shrew predicted the use of 900,708 hectares,
591,359 hectares, and 230,213 hectares of land,
respectively (Table 38.2). Each model was unique based
on the ranked habitat data for that species. Density
ranks were calculated individually for each species.
The deer mouse model contained five density ranks
(with 5 having the highest predicted density) (Fig.
38.1). The habitat with the lowest density included
301,675 hectares (33.5 percent) and the habitat with
the highest density included 72,421 hectares (8.0
percent) of habitat (Table 38.2). The meadow vole model
contained four density ranks (with 4 having the
highest predicted density) (Fig. 38.2). The habitat with
the lowest density included 355,391 hectares (60.1
percent) and the habitat with the highest density
included 63,309 hectares (10.7 percent) of habitat
(Table 38.2). The arctic shrew model included one
density rank (Fig. 38.3); this model included 230,213
hectares of habitat (Table 38.2).


Figure 38.3. Arctic shrew (Sorex arcticus) presence/absence model (left) and ranked model (right) produce the same results.


TABLE 38.2.


Hectares present in ranked suitability models for deer mouse (Peromyscus maniculatus), meadow vole
(Microtus pennsylvanicus), and arctic shrew (Sorex arcticus).


        Deer mouse            Meadow vole           Arctic shrew
Rank    Hectares    Percent   Hectares    Percent   Hectares    Percent
1       301,674.7   33.5      355,391.0   60.1      230,212.8   100.0
2       249,397.2   27.7      22,062.1    3.7
3       272,898.1   30.3      150,597.5   25.5
4       4,316.5     0.5       63,308.5    10.7
5       72,421.1    8.0
Total   900,707.6   100.0     591,359.1   100.0     230,212.8   100.0

In Scene 7, deer mice used 60.3 percent of the 
available habitat. Additionally, 38.8 percent of that 
habitat was classed as average to above average based 
on density (rank 3-5). Meadow voles used 39.6 per- 
cent of the available habitat in Scene 7. Of that, 36.2 
percent was classed as above average based on density 
(ranks 3-4). In Scene 7, arctic shrews used 15.4
percent of the available habitat. Only one habitat
category was present in this model.


Comparison of Models 


Presence/absence and ranked models for deer mice in- 
cluded the same area (900,708 hectares) of Scene 7, al- 
though differences occurred between the density rank- 
ings. The ranked model predicted a higher density in 
66.5 percent of the area, or 599,033 hectares. Thus, our 
results indicated that ranked models might be more ap- 
propriate for predicting occurrences of deer mice. 
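
The deer mouse figures above can be checked with simple arithmetic, on the reading that the area of "higher density" is the presence area lying outside the lowest density rank. That reading is an inference from the reported numbers, not a statement in the text, and the variable names are illustrative.

```python
present_ha = 900_708      # deer mouse presence/absence area, from the text
lowest_rank_ha = 301_675  # area in the lowest density rank, from the text

higher_density_ha = present_ha - lowest_rank_ha
higher_density_pct = 100 * higher_density_ha / present_ha

print(higher_density_ha, round(higher_density_pct, 1))  # 599033 66.5
```

Both results agree with the 599,033 hectares and 66.5 percent reported for the deer mouse.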

The meadow vole presence/absence model included 
more area than the ranked model. This was caused by 
a discrepancy between information found in Loke- 
moen and Duebbert (1976) and our trapping informa- 
tion. Lokemoen and Duebbert (1976) reported use of 
grasslands by meadow voles, including pasture. How- 
ever, meadow voles were not captured in pasture in 
our study, and, therefore, this habitat was not
included in our model. The presence/absence model
included 900,708 hectares of suitable habitat. The
ranked model predicted a higher density on 236,927 
hectares, or 26.3 percent of the presence/absence 
model. Our results indicated that ranked models 
might be more appropriate for predicting occurrences 
of the meadow vole. 

For the arctic shrew, presence/absence models and 
ranked models included 230,213 hectares. No differ- 
ences were found between the two models. Arctic 
shrews were captured in low densities (0.3 individuals 
per thousand trap nights) in grasslands and shelter- 
belts, and, therefore, were given a rank of 1 for both 
habitat types. Results of this model did not differ from 
the presence/absence model. Therefore, both models 
were considered effective at predicting arctic shrew 
occurrence.


Discussion 


Ranked density models were most informative when 
modeling habitat relationships for ubiquitous species. 
Species that occur in every habitat available at varying 
densities may be difficult to manage because the 
species’ exact requirements may not be known. 
Ranked models can focus attention on habitats with 
higher or lower density, depending on the objective of 
the study or management. Management can be con- 
ducted at the local level to increase habitat quality in 
areas of low density or to maintain habitat quality 
in areas of high density. Conversely, presence/absence 
models were similar to ranked models for rare or low- 
density species. When a species occurs in only one or
two habitat types, such as arctic shrews, one of the
two available management scenarios may be restora- 
tion or maintenance of current habitat. 

Competition for resources, such as food and living 
space, may limit the density of one species in the pres- 
ence of another. Because our objective was to compare 
modeling results from presence/absence and ranked 
models, we attempted to minimize competition bias. 
Therefore, we reported results for three species, an 
herbivore (meadow vole), an omnivore (deer mouse), and
an insectivore (arctic shrew). Thus, for our study, 
competition among those species was assumed to be 
negligible. Nevertheless, presence of other small mam- 
mals not included in our study likely affected species 
distributions. 

Due to a lack of information on small mammals in 
South Dakota, models could not be assessed for accu- 
racy. Nonetheless, our objective was to illustrate dif- 
ferences in distributions of small mammals resulting 
from the presence/absence and density information. 
However, information from ranked models can shed
light on some issues of accuracy. For example, errors
of omission (failure to capture a species where it was
predicted to occur) are an issue when determining
presence or absence of a species. Ranked models can help
reduce errors of omission by determining the suitabil- 
ity of the habitat before trapping and by determining 
the number of trap nights necessary to capture a 
species. Areas with lower densities of a given species 
may require additional trapping to detect presence in 
that area. High-density species may require as few as 
fifty trap nights to detect presence. Consequently, 
ranked models can reduce time spent in high-density 
areas and focus effort on areas with relatively low de- 
tection probabilities. 
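
One way to turn a capture rate into a rough trapping effort is sketched below. It rests on a strong simplifying assumption that is not made in the text: each trap night is treated as an independent trial whose capture probability equals the observed rate per thousand trap nights divided by one thousand. The function name and the 95 percent target are likewise illustrative.

```python
import math


def trap_nights_for_detection(rate_per_thousand, prob=0.95):
    """Trap nights needed for at least one capture with probability `prob`.

    Assumes independent trap nights with a constant capture probability.
    """
    p = rate_per_thousand / 1000.0
    if not 0.0 < p < 1.0:
        raise ValueError("rate must lie strictly between 0 and 1,000")
    return math.ceil(math.log(1.0 - prob) / math.log(1.0 - p))


# Rates from Table 38.1: deer mouse in shelterbelts, arctic shrew in grasslands.
deer_mouse_effort = trap_nights_for_detection(24.1)
arctic_shrew_effort = trap_nights_for_detection(0.3)
```

Under these assumptions the required effort grows steeply as the capture rate falls, which is the pattern the paragraph above describes for low-density species.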
Calibration Methodology for an Individual-based,
Spatially Explicit Simulation Model:
Case Study of White-tailed Deer
in the Florida Everglades


Christine S. Hartless, Ronald E. Labisky, and Kenneth M. Portier 


E simulation models have been impor- 
tant tools in wildlife ecology and conservation, 
and their use continues to increase. Simulations enable 
scientists to model the effects of environmental catas- 
trophes or management strategies on target popula- 
tions without conducting expensive or difficult experi- 
ments. Early population models ranged from simple 
models, such as logistic growth (Pearl and Reed 1920; 
Renshaw 1990) to stage- or age-based matrix models 
(Leslie 1945; Lefkovitch 1965). More recent popula- 
tion models have incorporated a spatial component, 
such as spatial dispersion models (Skellam 1951) and 
metapopulation models (Levins 1969; Hanski and 
Gilpin 1997). In the continuing evolution of simula- 
tion models, individual-based models—those using the 
individual as the basic unit (DeAngelis and Gross 
1992; Grimm 1999)—are the current state of the art 
and represent a shift toward more mechanistic models 
(Maurer, Chapter 9). Within this large class of models, 
individual-based spatially explicit (IBSE) models are 
used to simulate individual movement processes and 
interactions over a heterogeneous landscape. This 
ability to model interactions among individuals and 
interactions between individuals and their environ- 
ment has provided insight into many ecological 
processes (Huston et al. 1988). As computing power 
and speed increase, and computer costs decrease, 
more-complex IBSE models become feasible. The
increasing reliance of management decisions on
simulation models drives the necessity for development of
tools to adequately calibrate and validate these 
models. 

Most individual-based models incorporate a large 
number of parameters (Grimm 1999). Some model 
parameters, such as the number of offspring or the 
survival rate, are estimated from published values. In 
contrast, parameters characterizing movement pat- 
terns of individuals in a simulation rarely can be esti- 
mated from published values or even derived from 
other measurable variables on study animals. These 
parameters (e.g., the distance an individual can “see” 
when making a movement decision, or the effect of 
previous movements on the current movement deci- 
sion) are either difficult or impossible to estimate. 
However, the best-fitting movement algorithms and 
associated parameter values can be determined by 
evaluating discrepancies in measured outcomes (e.g., 
home range size) between study animals and simu- 
lated animals. 
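
The idea of choosing parameter values by comparing simulated and observed outcomes can be illustrated with a deliberately simple sketch. Everything in it is hypothetical: the toy simulator, the "perception distance" parameter, the squared difference in mean home range size used as the discrepancy, and the observed values are invented for the example and do not represent the deer model described in this chapter.

```python
def discrepancy(observed, simulated):
    """Squared difference between mean observed and mean simulated outcome."""
    mean_obs = sum(observed) / len(observed)
    mean_sim = sum(simulated) / len(simulated)
    return (mean_obs - mean_sim) ** 2


def toy_simulator(perception_distance):
    """Hypothetical stand-in for a movement simulation.

    Returns home range sizes (hectares) for three simulated animals.
    """
    return [40.0 * perception_distance + offset for offset in (-10.0, 0.0, 10.0)]


def calibrate(observed, candidates, simulator):
    """Return the candidate value whose simulated output is closest to the data."""
    return min(candidates, key=lambda value: discrepancy(observed, simulator(value)))


observed_home_ranges = [190.0, 200.0, 215.0]  # invented values, hectares
best_value = calibrate(observed_home_ranges, range(1, 9), toy_simulator)
```

A real calibration would replace the toy simulator with repeated runs of the stochastic model and would usually combine several discrepancy measures, but the selection step has the same shape.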

Bart (1995a) and Conroy et al. (1995) suggest 
guidelines for model development and testing that 
include the need for clearly stated model objectives, a 
description of the model structure, and a sensitivity 
analysis to assess effects of parameter uncertainty on 
model outputs. In addition, model development also 
requires verification, calibration, and validation. As
defined by Rykiel (1996), verification is a demonstration
that the model form is correct; calibration is the
estimation and adjustment of model parameters to
improve agreement between model output and observed
data; and validation is a demonstration that the model 
output possesses the accuracy required for intended 
applications. 

Although an individual-based model ought to be 
more testable than a state-variable model (i.e., a 
model with a population, community, or ecosystem as 
the basic unit; Murdoch et al. 1992), only 36 percent 
of the fifty individual-based modeling papers reviewed 
by Grimm (1999) explicitly discussed validation or 
corroboration of the presented models. Statistical 
tools for rigorous calibration and validation of simu- 
lation models do exist, and they generally fall into two 
groups. Some are described using ecological process 
and population models (e.g., Van der Molen and Pin- 
tér 1993; Rykiel 1996), and others are described in 
operations research and industrial engineering settings 
(e.g., Sargent 1984; Kleijnen 1987). In this chapter, 
statistical tools that aid in model verification and cali- 
bration in the context of IBSE models are presented. 
These tools are demonstrated with the calibration of a 
simulation model of the movement patterns of white- 
tailed deer (Odocoileus virginianus seminolus) in the 
Florida Everglades. 


Methods 


Determining correct model form (verification) involves 
evaluating the conceptual structure and the transfor- 
mation of the structure into computer algorithms. 
Model structure is often visualized through flow 
charts. Detailed literature review and analyses of addi- 
tional data aid in the verification of correct conceptual 
model structure. Assuming the model structure is cor- 
rect, the calibration process consists of altering pa- 
rameter values until the modeled system is represented 
adequately. Calibration also may reveal algorithms in 
the simulation model that need further modification if 
the optimum parameterization of the algorithm is not 
sufficient. Updating algorithms and parameter values 
is an iterative process requiring constant reevaluation 
of the simulation model. 

The first issue to address in model calibration was 
the amount of time (i.e., number of iterations) the simulation
must run before evaluation becomes meaningful.
A second evaluation issue, quantitative comparison
of the simulated data and observed data, was addressed
with discrepancy measures. A third issue,
optimization of simulation parameter settings, was ac- 
complished by conducting experiments using the de- 
signs and iterative techniques of response surface 
methodology. Qualitative evaluation (e.g., visual com- 
parison) was also an important component of model 
calibration. 


Simulation Burn-in Time 


The burn-in period, an artifact of computer simula- 
tion, was defined as the number of iterations required 
for the simulation to reach a steady state (Kleijnen 
1987). Estimation of burn-in time was accomplished 
by allowing the simulation to run for extended periods 
of time and examining temporal trends and autocorre- 
lations of simulation outcomes (e.g., annual home 
range size in an animal movement model). To avoid 
the confounding of burn-in and temporal algorithms 
and parameters, simulation burn-in time was evalu- 
ated before and after temporal effects were included in 
the simulation. 

To estimate burn-in time, a simulation was run for 
an extended period of time and summary outcomes 
were calculated for all time intervals for each individ- 
ual in the simulation (i.e., annual summary outcomes 
for a simulation run for fifteen years). A repeated 
measures analysis was performed to identify signifi- 
cant time trends in the summary outcome (Diggle et 
al. 1994; Littell et al. 1996; Vonesh and Chinchilli 
1997). If no significant linear time trend was present, 
burn-in time did not affect that particular outcome. A 
significant linear time trend was evidence that simula- 
tion burn-in time did affect the outcome. If that oc- 
curred, the test for a linear trend was repeated using 
all intervals except the first. If the second test for time 
trend was not statistically significant, burn-in time 
was established at one interval; otherwise, the test for 
a linear trend was repeated excluding the first and sec- 
ond time intervals. These steps continued until the 
time trend in the remaining intervals was no longer 
significant, indicating simulation burn-in was com- 
pleted. If the simulation did not reach a steady state 
until the end of the simulated time period or never
reached a steady state (i.e., simulation burn-in time
was equal to or longer than the time period of the sim- 
ulation), further exploration of burn-in time and the 
simulation algorithms was necessary. 
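
The stepwise procedure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the analysis used in this chapter: the repeated measures analysis is replaced by a permutation test on a pooled least-squares slope, and the function names, the `min_intervals` cutoff, and the example data are hypothetical.

```python
import random


def pooled_slope(series_by_animal):
    """Least-squares slope of outcome on interval index, pooled across
    animals after centering each animal's series on its own mean."""
    num = 0.0
    den = 0.0
    for series in series_by_animal:
        n = len(series)
        t_bar = (n - 1) / 2
        y_bar = sum(series) / n
        for t, y in enumerate(series):
            num += (t - t_bar) * (y - y_bar)
            den += (t - t_bar) ** 2
    return num / den


def trend_p_value(series_by_animal, n_perm=999, seed=0):
    """Permutation p-value for a linear time trend (an assumed stand-in
    for the repeated measures test): shuffle intervals within each animal."""
    rng = random.Random(seed)
    observed = abs(pooled_slope(series_by_animal))
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(s, len(s)) for s in series_by_animal]
        if abs(pooled_slope(shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def estimate_burn_in(series_by_animal, alpha=0.01, min_intervals=3):
    """Drop leading intervals one at a time until no significant trend
    remains. Returns the number of intervals dropped, or None if the
    trend never disappears (further exploration would then be needed)."""
    n_intervals = len(series_by_animal[0])
    for dropped in range(n_intervals - min_intervals + 1):
        remaining = [s[dropped:] for s in series_by_animal]
        if trend_p_value(remaining) >= alpha:
            return dropped
    return None


# Hypothetical annual home range sizes (ha) for ten simulated animals over
# fifteen years: two inflated years followed by a constant steady state.
example = [[100.0 + i, 50.0 + i] + [10.0 + i] * 13 for i in range(10)]
burn_in_years = estimate_burn_in(example)
```

In an experiment with several outcomes and EUs, the burn-in time for the experiment would be the maximum of the values returned for each outcome and EU.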


Discrepancy Measures 


Discrepancy measures (DMs) quantify the difference 
between a simulated data set and an observed data set 
(Van der Molen and Pintér 1993). The general form of 
the discrepancy measure was 


\[ D(x) = f\bigl(P(x),\, O\bigr), \qquad x \in X \tag{39.1} \]


where P(x) was a summary statistic for a run of the 
simulation with parameter set x, x was an element of 
X (the set of all feasible model parameters), and O 
was the same summary statistic computed for the ob- 
served data. Examples of summary statistics for IBSE 
simulations of animal movement were mean home 
range size or the mean percentage of time individuals 
were located in a specific habitat. 
One family of discrepancy measures had the form: 


\[ D(x) = \lvert P(x) - O \rvert^{\beta} \tag{39.2} \]


where 1 ≤ β < ∞. For β = 1, D(x) was the absolute deviation
between simulated and observed data, and for
β = 2, D(x) was the squared deviation between simulated
and observed data. Evaluation of the summary
statistics such as mean annual home range size or 
mean distance between consecutive home range cen- 
ters was accomplished with this family of DMs. 

A discrepancy function useful for evaluating a set 
of n dependent outcomes, such as the percentage of 
observations (i.e., radio locations) in different habi- 
tats, had the form 


\[ D(x) = \sum_{i=1}^{n} \frac{\bigl(O_i - P_i(x)\bigr)^{2}}{P_i(x)} \tag{39.3} \]

where $O_i$ was the summary statistic for the observed
data for the ith outcome, $P_i(x)$ was the ith summary
statistic for a run of the simulation with parameter set
x, and x was an element of X (the set of all feasible
model parameters). This DM approximated a chi-square
goodness-of-fit statistic with n - 1 degrees of
freedom, where $P_i(x)$ and $O_i$ were the percentage of
locations in habitat i based on simulated and field 
data, respectively. Mayer and Butler (1993), Power 
(1993), and Van der Molen and Pintér (1993) provided
additional discrepancy measures.
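
As a minimal sketch, the two kinds of discrepancy measure can be written as short Python functions. The function names are hypothetical, and the second function assumes the usual Pearson form in which the simulated percentages play the role of expected values; that choice of denominator is an assumption of this sketch.

```python
def power_discrepancy(simulated, observed, beta=1.0):
    """Equation 39.2 style measure: |P(x) - O| ** beta, with beta >= 1."""
    if beta < 1:
        raise ValueError("beta must be at least 1")
    return abs(simulated - observed) ** beta


def chi_square_discrepancy(simulated_pct, observed_pct):
    """Pearson-type discrepancy over n dependent outcomes (for example,
    percentage of locations per habitat). Simulated percentages are
    treated as the expected values, which is an assumption."""
    if len(simulated_pct) != len(observed_pct):
        raise ValueError("both inputs need one value per outcome")
    if any(p <= 0 for p in simulated_pct):
        raise ValueError("simulated percentages must be positive")
    return sum((o - p) ** 2 / p
               for p, o in zip(simulated_pct, observed_pct))


# Home range example: a simulated mean of 336 ha against an observed 271 ha.
absolute_dm = power_discrepancy(336, 271)           # beta = 1
squared_dm = power_discrepancy(336, 271, beta=2)    # beta = 2

# Hypothetical habitat percentages for three habitats.
habitat_dm = chi_square_discrepancy([40, 40, 20], [50, 30, 20])
```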


Experimental Design and Analysis 


Simulation experiments were conducted to determine 
the appropriate set of algorithms and parameter val- 
ues that minimized discrepancies between simulated 
and observed data. Parameters from the simulation 
model were factors in the experiment, and one run of 
the simulation constituted an experimental unit (EU). 
Once the experimental design was selected and the 
simulation runs completed, burn-in time was evalu- 
ated for each summary outcome for each EU. Burn-in 
time for the experiment was the maximum burn-in 
time of all evaluated summary outcomes for all EUs. 
Statistical analyses were performed on the DMs, cal- 
culated using data from the time interval following the 
burn-in time for the experiment. 

When evaluating large numbers of model parame- 
ters, effects of each parameter on simulation outcomes 
are often complicated and difficult to identify. Factor- 
ial experiments accomplish the task of simultaneous 
investigation of the effects of many factors (i.e., simulation
model parameters). Moreover, the ensuing
analysis of variance (ANOVA) of these experiments 
can include interaction terms that explain interrela- 
tionships among the parameters of the simulation. 
However, as the number of investigated factors in- 
creases, the number of EUs required to examine all 
possible factor combinations increases rapidly. For ex- 
ample, one replicate of a factorial experiment with p 
factors, each with k levels, requires $k^p$ EUs. Other experimental
designs (e.g., fractional factorials) make
more efficient use of resources by requiring a minimal 
number of EUs. Response surface methodology pro- 
vides a collection of specific experimental designs and 
statistical techniques to facilitate the estimation of fac- 
tor settings that optimize a response variable (Khuri 
and Cornell 1987; Montgomery 1991). 
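
The growth in the number of EUs, and the saving from a half fraction, can be illustrated with a short Python sketch. The factor names are hypothetical labels for six two-level simulation parameters, and the rule used to pick the half fraction (keep the runs whose coded levels multiply to +1) is one common choice assumed here for illustration.

```python
from itertools import product
from math import prod


def full_factorial(levels_by_factor):
    """All combinations of factor levels: k**p runs for p factors with
    k levels each."""
    names = list(levels_by_factor)
    return [dict(zip(names, combo))
            for combo in product(*(levels_by_factor[n] for n in names))]


def half_fraction(low_high_by_factor):
    """Half fraction of a two-level factorial: keep the runs whose coded
    levels (-1 for low, +1 for high) multiply to +1."""
    runs = []
    for run in full_factorial(low_high_by_factor):
        coded = [-1 if run[n] == low_high_by_factor[n][0] else 1
                 for n in run]
        if prod(coded) == 1:
            runs.append(run)
    return runs


# Six hypothetical two-level factors given as (low, high) pairs.
factors = {
    "steps": (150, 500),
    "pixel_m": (40, 60),
    "memory_months": (2, 6),
    "lam": (2, 10),
    "phi": (2, 4),
    "mu_m": (1000, 2000),
}

n_full = len(full_factorial(factors))   # 2**6 runs
design = half_fraction(factors)         # 2**(6-1) runs
```

With six factors at two levels, the full factorial needs 64 EUs and the half fraction needs 32, at the cost of confounding some higher-order interactions with one another.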

Typically, a sequence of experiments is necessary to
optimize the simulation model parameters, with the 
analysis results of each experiment dictating the par- 
ticulars of the following experiment. First-order de- 
signs are used as initial screening experiments to esti- 
mate and test main effects and interactions among the 
factors. Common first-order designs are $2^p$ factorials
and fractions of $2^p$ factorials (Cochran and Cox 1957;
Montgomery 1991). Fractional factorials reduce the 
number of required EUs with the assumption that 
higher-order interactions (i.e., three- and four-way in- 
teractions) are negligible. This first-order experiment 
would determine if an optimum was attained inside 
the experimental region. If an optimum was not at- 
tained inside the initial experimental region, the 
method of steepest descent was used to establish the 
parameter settings for another experiment more likely 
to contain the optimum (Khuri and Cornell 1987; 
Montgomery 1991). This process was repeated, result- 
ing in a sequence of experiments. 
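
A minimal Python sketch of the steepest descent step follows. It assumes a two-level design in coded units (-1 for low, +1 for high) and a first-order fit with no interactions; the function names, step size, and synthetic responses are hypothetical.

```python
from itertools import product
from math import sqrt


def main_effect_slopes(coded_runs, responses):
    """First-order coefficients for a two-level design in -1/+1 coding:
    half the difference between the mean response at +1 and at -1."""
    slopes = []
    for f in range(len(coded_runs[0])):
        high = [y for run, y in zip(coded_runs, responses) if run[f] > 0]
        low = [y for run, y in zip(coded_runs, responses) if run[f] < 0]
        slopes.append((sum(high) / len(high) - sum(low) / len(low)) / 2)
    return slopes


def steepest_descent_path(slopes, n_steps=3, step_size=1.0):
    """Points in coded units along the direction in which the fitted
    first-order response decreases fastest."""
    norm = sqrt(sum(b * b for b in slopes))
    if norm == 0:
        raise ValueError("all slopes are zero; no descent direction")
    direction = [-b / norm for b in slopes]
    return [[k * step_size * d for d in direction]
            for k in range(1, n_steps + 1)]


# Synthetic example: a discrepancy measure that behaves as 10 + 3*x1 - 4*x2.
runs = list(product((-1, 1), repeat=2))
dms = [10 + 3 * x1 - 4 * x2 for x1, x2 in runs]
slopes = main_effect_slopes(runs, dms)
path = steepest_descent_path(slopes)
```

Each point on the path is converted back to natural units to give the center of a candidate region for the next experiment.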

When statistical analyses indicated that the design 
settings were close to an optimum (i.e., the DM is 
close to zero inside the experimental region), addi- 
tional experimentation was necessary to identify a set 
of parameter settings that minimizes the DM (a local 
minimum). More specifically, a second-order design, 
estimating main effects, first-order interactions, and 
quadratic effects, was often required to approximate 
the curvature of the true response surface. Common 
second-order designs are central composite designs, $3^p$
factorials, and $3^p$ fractional factorials (Cochran and
Cox 1957; Khuri and Cornell 1987). 
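
As an illustration of a second-order design, the following Python sketch lists the points of a central composite design in coded units. The rotatable axial distance (the fourth root of the number of factorial points) and the number of center points are assumptions of this sketch; the text above does not specify a particular variant.

```python
from itertools import product


def central_composite_design(n_factors, n_center=3):
    """Coded design points: 2**p factorial corners, 2*p axial points at
    distance alpha from the center, and replicated center points."""
    corners = [list(c) for c in product((-1.0, 1.0), repeat=n_factors)]
    alpha = len(corners) ** 0.25  # rotatable choice (assumed)
    axial = []
    for f in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[f] = sign * alpha
            axial.append(point)
    center = [[0.0] * n_factors for _ in range(n_center)]
    return corners + axial + center


ccd_two_factors = central_composite_design(2)
ccd_five_factors = central_composite_design(5)
```

The axial and center points are what allow quadratic effects to be estimated; a two-level factorial alone cannot separate curvature from the main effects.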


Visual Assessment of the Simulation Model 


An additional component of model verification and 
calibration was the visual comparison of simulation 
output and observed data. Current geographic infor- 
mation systems (GIS) software is easy to use, accessi- 
ble, and very powerful. Even if the DMs based on 
summary statistics demonstrate that simulation output 
is comparable to observed data, movement patterns 
also must be visually realistic. 


Case Study: White-tailed Deer 
in the Florida Everglades 


An IBSE simulation model of movement patterns of 
adult white-tailed deer in the Florida Everglades was 
developed. The model provided a means of exploring 
the patterns of habitat use of deer in response to fre- 
quent environmental catastrophes (e.g., tropical 
storms) and to different management regimes (e.g., 
water control). Maintenance of robust deer populations
in this system is important because deer are the
major prey of the endangered Florida panther (Puma 
concolor coryi) (Maehr et al. 1990) and the bobcat 
(Lynx rufus) (Maehr and Brady 1986), as well as a re- 
source for human recreation. Furthermore, the statisti- 
cal methods used to calibrate this IBSE model provide 
a foundation for the development of future simulation 
models. The simulation model was implemented in 
C++ (Borland C++ Builder 3.0, Inprise), using object- 
oriented programming techniques. 

The 250-square-kilometer study area (Fig. 39.1) is 
located in the wet prairie/tree island ecosystem that 
extends from the Stairsteps Unit of the Big Cypress
National Preserve (BCNP) south into Everglades Na- 
tional Park (ENP). It is bounded on the north by Loop 
Road, on the west by Lostmans and Dayhoff Sloughs, 
and on the east and south by Shark River Slough. 

The Everglades ecosystem is characterized by a 
subtropical climate with alternating dry winters
(November-April) and wet summers (May-October).
Mean monthly temperature ranges from
14 degrees Celsius in January to 28 degrees Celsius
in August. Mean annual precipitation is 136 cen- 
timeters, two-thirds of which falls between May and 
October (Duever et al. 1986). The onset and dura- 
tion of the wet seasons are highly variable; thus, pe- 
riods of either drought or flooding are common. 
Tropical cyclones (hurricanes and tropical storms) 
occur in this region of Florida at a frequency of one 
every three years (Gentry 1984) and often exacerbate 
the severity of floods. 

Topographically the region is nearly flat and is 
characterized by a southwestward sheet flow of water. 
The major plant communities on the study area are 
wet prairie (87 percent), typified by a hydroperiod of 
50-150 days, and small, widely dispersed, and slightly 
elevated hardwood tree islands (7 percent) (Duever et 
al. 1986; Miller 1993). The wet prairie is character- 
ized by a complex of grasses and sedges; the tree is- 
lands contain both temperate and tropical hardwoods. 

This white-tailed deer population was studied from 
1989 to 1995 (Boulay 1992; Sargent 1992; Zultowsky
1992; Miller 1993; Sargent and Labisky 1995; Mac- 
Donald 1997; Labisky et al. 1999). Estimated densi- 
ties for 1990-1992 averaged 3.65 (SE = 1.47) deer per 
square kilometer for the hunted BCNP population and 
4.68 (SE = 1.00) deer per square kilometer for the
nonhunted ENP population (R. F. Labisky unpublished
data). These deer are nonmigratory (Loveless
1959) and exhibit a high degree of site fidelity (MacDonald
1997). The cyclic rising and falling of water
levels influences movements (Sargent and Labisky
1995; Zultowsky 1992), habitat use (Hunter 1990a;
Miller 1993), and reproductive phenology (Loveless
1959; Richter and Labisky 1985; Boulay 1992).


Figure 39.1. Habitat within the white-tailed deer study area in Big Cypress National Preserve (BCNP) and Everglades National Park
(ENP). Map developed by Miller (1993).


Data Collection 


The data set used for model calibration included year- 
ling and adult deer that were captured, radio-collared, 
and monitored between 1989 and 1992. Due to the 
inaccessibility of the study area, all radio-monitoring 
was conducted during daylight hours from a fixed-wing
aircraft. To obtain unbiased temporal monitoring,
radio locations for each deer were evenly distributed
among four daylight periods: sunrise to two
hours post-sunrise, two hours post-sunrise to noon,
noon to two hours pre-sunset, and two hours pre-sun- 
set to sunset. Each deer was located, on average, once 
every five days. The location error associated with aer- 
ial-based telemetry, estimated from blind placement of 
dummy radio collars, was equal to 30 meters (Miller 
1993). 

Data on forty-six yearling or adult deer, character- 
ized by a minimum of one year of radio-locations and 
no dispersal movements, were used for initial model 
calibration (Table 39.1). Twenty-four deer were radio- 
monitored for one year, eighteen deer for two years, 
and four deer for three years. For each deer, annual 
home range size was calculated using the 95 percent 
fixed kernel estimator with least squares cross valida- 
tion (Silverman 1986; Worton 1989; Seaman and 
Powell 1996). Average distance between consecutive 
radio-locations served as an indicator for the mini- 
mum distance a deer traveled during a five-day inter- 
val. Distance between centers of consecutive annual 
home ranges was calculated to assess the degree of site
fidelity. Percentage of radio locations occurring in 
each habitat was used to examine habitat use. 
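
Three of these summary outcomes are simple enough to sketch in Python. The function names are hypothetical, and the center of an annual home range is taken here as the arithmetic mean of that year's locations, which is an assumption of the sketch (the kernel home range estimate itself is not reproduced).

```python
from collections import Counter
from math import hypot


def mean_step_distance(locations):
    """Mean straight-line distance between consecutive radio locations."""
    steps = [hypot(x2 - x1, y2 - y1)
             for (x1, y1), (x2, y2) in zip(locations, locations[1:])]
    return sum(steps) / len(steps)


def centroid(locations):
    """Arithmetic mean of the x and y coordinates."""
    xs, ys = zip(*locations)
    return sum(xs) / len(xs), sum(ys) / len(ys)


def center_shift(year_a, year_b):
    """Distance between the centers of two consecutive annual sets of
    locations, used as an index of site fidelity."""
    (xa, ya), (xb, yb) = centroid(year_a), centroid(year_b)
    return hypot(xb - xa, yb - ya)


def habitat_use_percent(habitats):
    """Percentage of locations falling in each habitat."""
    counts = Counter(habitats)
    return {h: 100 * c / len(habitats) for h, c in counts.items()}


# Hypothetical coordinates in meters and habitat labels.
step_m = mean_step_distance([(0, 0), (3, 4), (3, 4)])
shift_m = center_shift([(0, 0), (2, 0)], [(4, 4), (6, 4)])
use = habitat_use_percent(["wet prairie"] * 3 + ["tree island"])
```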

Several raster maps of the study area were used, 
each with 20-meter by 20-meter pixels (referred to as
“20-meter pixels”). The habitat map, which had an
estimated accuracy of 80.4 percent, contained seven 
habitat classes (Fig. 39.1). A spatiotemporal map of 
water depths was created using relative elevations of 
different habitats and mean monthly water depths 
recorded at the hydrological station (P-34) located in 
the wet prairie near the center of the study area. 


Model Parameterization 


The experiments presented in this chapter are two of a 
series of experiments used to calibrate the deer move- 
ment model. They illustrate the estimation of burn-in 
time and the use of experimental design to optimize 
model parameters. The focus of these two particular 
experiments, restricted to female deer, is scale of
movement and home range parameters. Six model pa- 
rameters were evaluated in the following calibration 
experiments. Temporal factors (water levels), survival 
and fecundity parameters, and deer interactions were 
not included in the presented simulation experiments. 

TABLE 39.1.

Average annual outcome measures for observed white-tailed
deer in Big Cypress National Preserve and Everglades
National Park, April 1989 to March 1992.

Observed outcome                            Females (N = 29)   Males (N = 17)
Home range size                             271 ha (20)^a      316 ha (49)
Distance between consecutive locations      686 m (29)         779 m (79)
Distance between consecutive centers^b      218 m (53)         243 m (56)
Percent observations in each habitat (%)
  Wet prairie                               53% (4)            25% (4)
  Herbaceous prairie                        17% (3)            20% (3)
  Tree islands/scrub                        17% (2)            33% (4)
  Willow/sawgrass                           9% (1)             18% (2)
  Dwarf cypress prairie                     1% (1)             2% (2)
  Cypress strand                            < 1% (< 1)         1% (1)
  Pine                                      < 1% (< 1)         < 1% (< 1)
  Mangrove                                  2% (2)             1% (1)

^a Standard error in parentheses.
^b For females, N = 15, and for males, N = 7.


Simulated deer made multiple daily movements, and
the number of steps over a five-day interval was one of
the simulation parameters assessed in the following calibration
experiments. Each step consisted of two stages:
first, selection of a 40-meter by 40-meter pixel (“40-meter
pixel”) or a 60-meter by 60-meter pixel (“60-meter
pixel”), then selection of a 20-meter pixel within
the new 40-meter or 60-meter pixel (Fig. 39.2). The size
of the pixel in the first stage of movement was the second
parameter evaluated in these calibration experiments.
In the first stage of each movement step, deer
moved a maximum of one pixel in any direction, using
either 40- or 60-meter pixels. During this stage, a deer
evaluated its surroundings and determined the probability
of moving to each pixel based on habitat and relative
location inside its home range.


Figure 39.2. Two-stage movement process of a simulated
deer using 60-meter pixels for the first stage. This hypothetical
deer moved from the upper left 20-meter pixel in the center 60-
meter pixel to the lower right 20-meter pixel in the lower left
60-meter pixel.


TABLE 39.2.

Initial habitat relative affinity scores for simulated adult
females.

Habitat                   Symbol    Relative affinity
Wet prairie               A_WET     10
Herbaceous prairie        A_HPR     20
Tree island               A_TRE     50
Willow/dense sawgrass     A_WSA     50
Cypress prairie/strand    A_CYS     10
Pine                      A_PIN     10
Mangrove                  A_MAN     1

Each 20-meter pixel was assigned a relative affinity 
score based on habitat contained inside the pixel. Ini- 
tial values for the relative affinities (Table 39.2) were 
updated throughout the calibration process. Relative 
affinity scores for the 40- and 60-meter pixels were de- 
termined using the mean relative affinity of their com- 
ponent 20-meter pixels. 
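
The aggregation from 20-meter pixels to first-stage pixels can be sketched as a block mean. In this Python illustration the habitat codes, the small grid, and the function name are hypothetical; the three affinity scores are the initial values for wet prairie, herbaceous prairie, and tree island.

```python
def block_mean_affinity(habitat_grid, affinity_by_habitat, block):
    """Mean relative affinity of each block-by-block group of 20-meter
    pixels (block=2 gives 40-meter pixels, block=3 gives 60-meter pixels)."""
    n_rows, n_cols = len(habitat_grid), len(habitat_grid[0])
    if n_rows % block or n_cols % block:
        raise ValueError("grid dimensions must be multiples of block")
    coarse = []
    for r in range(0, n_rows, block):
        row = []
        for c in range(0, n_cols, block):
            scores = [affinity_by_habitat[habitat_grid[r + i][c + j]]
                      for i in range(block) for j in range(block)]
            row.append(sum(scores) / len(scores))
        coarse.append(row)
    return coarse


# Hypothetical habitat codes mapped to initial relative affinity scores.
affinity = {"WET": 10, "HPR": 20, "TRE": 50}

# A hypothetical 2 x 4 patch of 20-meter pixels.
grid = [
    ["WET", "WET", "TRE", "TRE"],
    ["WET", "HPR", "TRE", "WET"],
]

affinity_40m = block_mean_affinity(grid, affinity, block=2)
```

A block that is mostly wet prairie with one herbaceous prairie pixel scores only slightly above wet prairie, so small habitat patches are smoothed at the coarser scale and only regain their influence in the second stage.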

The formation and maintenance of home ranges 
were controlled by two algorithms that incorporate the
previous locations of the simulated deer. The length of 
this memory was the third parameter evaluated in the 
experiments. 

The homing beacon algorithm gave simulated deer 
an affinity for pixels closer to their home-range center. 
The location of the homing beacon, based on k previ- 
ous radio location coordinates (one every five days), 
was updated every five days using the moving average: 


\[ x_{home} = \frac{1}{k}\,(x_1 + x_2 + \cdots + x_k) \tag{39.4} \]

where $(x_1, y_1)$ were the coordinates of the most recent
radio-location and the moving window was k radio-locations
wide (i.e., the length of the deer's memory), with $y_{home}$
computed in the same way from $y_1, \ldots, y_k$.
The relative affinity scores for the nine pixels to which
the deer could move were based on the direction of
travel from the current location of the deer to the
homing beacon (Fig. 39.3). To avoid simulated deer
gradually shrinking their home ranges with movements
concentrated around the homing beacon, the
strength of the beacon, δ (equal to 1, (1 + φ)/2, or φ), was
reduced exponentially as a deer moved closer to its
beacon:

\[ \text{affinity}_j = \begin{cases} \delta^{\,z/\mu} & \text{if } z < \mu \\ \delta & \text{otherwise} \end{cases} \tag{39.5} \]

for j = 1, 2, 3, ..., 9, and where δ was the relative
affinity score, μ was the distance from homing beacon
at which the relative affinity was constant, and z was
the distance from current location to homing beacon.
The fourth and fifth parameters to be assessed in the
calibration experiments were φ and μ.


Figure 39.3. Illustration of homing beacon relative affinity calculations,
with the homing beacon located southeast of current
location (center pixel). Affinity scores to move (a) toward the
south and (b) toward the east were averaged to give (c) the relative
affinity scores of moving to each of nine possible pixels.
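
The moving-average update of the homing beacon can be sketched in Python with a fixed-length queue. The class name is hypothetical, and averaging over fewer than k locations while the memory is still filling is an assumption of this sketch. With one location every five days, a two-month memory would correspond to roughly k = 12 locations.

```python
from collections import deque


class HomingBeacon:
    """Moving-average home range center over the k most recent radio
    locations (one location every five days)."""

    def __init__(self, k):
        # A deque with maxlen=k discards the oldest location automatically.
        self.recent = deque(maxlen=k)

    def update(self, x, y):
        """Record a new location and return the updated beacon position."""
        self.recent.append((x, y))
        n = len(self.recent)
        return (sum(px for px, _ in self.recent) / n,
                sum(py for _, py in self.recent) / n)


# Hypothetical coordinates in meters, with a three-location memory.
beacon = HomingBeacon(k=3)
for location in [(0, 0), (3, 3), (6, 6), (9, 9)]:
    position = beacon.update(*location)
```

After the fourth update the first location has left the window, so the beacon sits at the mean of the last three locations.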

Simulated deer had a stronger affinity for previ- 
ously visited pixels than for unfamiliar pixels. Relative 
affinity for a pixel j was defined as 


\[ \text{affinity}_j = \begin{cases} \lambda & \text{if pixel } j \text{ was visited during known memory} \\ 1 & \text{otherwise} \end{cases} \tag{39.6} \]

for j = 1, 2, ..., 9 and where λ (> 1) was the relative
affinity based on previous locations and the sixth simulation
parameter evaluated in the calibration experiments.

After a simulated deer evaluated the eight sur- 
rounding pixels and its current location for these three 
factors, the probability of moving to each pixel was
calculated using the relative affinity scores. The relative
affinities for each factor were standardized by
converting to probabilities using

\[ p_{ij} = \frac{\text{affinity}_{ij}}{\sum_{j'=1}^{9} \text{affinity}_{ij'}} \tag{39.7} \]

where $p_{ij}$ was the probability of moving to pixel j for
factor i (i = 1, 2, 3: habitat, homing beacon, and location
memory, respectively) and $\text{affinity}_{ij}$ was the relative
affinity score for pixel j for factor i. A weighted
average of these probabilities was computed for each
pixel using:

\[ \pi_j = \frac{p_{1j} + p_{2j} + p_{3j}}{3} \tag{39.8} \]

where $p_{1j}$, $p_{2j}$, and $p_{3j}$ were the probabilities of moving
to pixel j based on habitat, homing beacon, and
previous locations, respectively. The deer chose a pixel
for its next location based on a random draw from the
multinomial distribution ($\pi_1, \pi_2, \pi_3, \ldots, \pi_9$), completing
the first stage of the movement step.
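
The first stage of a movement step can be sketched in Python as follows. The function names and example affinity scores are hypothetical, and the three factors are given equal weight, which is an assumption of this sketch.

```python
import random


def to_probabilities(affinities):
    """Standardize relative affinity scores so that they sum to one."""
    total = sum(affinities)
    return [a / total for a in affinities]


def movement_probabilities(habitat, beacon, memory):
    """Average the standardized habitat, homing beacon, and location
    memory probabilities for each candidate pixel (equal weights assumed)."""
    factors = [to_probabilities(f) for f in (habitat, beacon, memory)]
    return [sum(ps) / len(factors) for ps in zip(*factors)]


def choose_pixel(probabilities, rng):
    """Index of the chosen pixel: a single random draw in which pixel j
    is selected with probability probabilities[j]."""
    return rng.choices(range(len(probabilities)),
                       weights=probabilities, k=1)[0]


# Hypothetical scores for the eight neighbors plus the current pixel.
habitat = [10, 10, 50, 10, 10, 20, 10, 10, 10]
beacon = [1, 1, 1, 1, 1, 2, 1, 2, 4]
memory = [8, 1, 1, 8, 8, 1, 1, 1, 1]

pi = movement_probabilities(habitat, beacon, memory)
next_pixel = choose_pixel(pi, random.Random(1))
```

The second stage can reuse `to_probabilities` and `choose_pixel` on the habitat affinities of the four or nine 20-meter pixels inside the selected first-stage pixel.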

The s 20-meter pixels (s = 4 if 40-meter pixels were 
used or s = 9 if 60-meter pixels were used for the first 
stage) contained inside the pixel representing the cur- 
rent location of the deer were evaluated based on 
habitat using the relative affinity scores. The scores 
were converted to probabilities: 


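\[ \pi^{*}_j = \frac{\text{affinity}_j}{\sum_{j'=1}^{s} \text{affinity}_{j'}} \tag{39.9} \]

where $\pi^{*}_1, \pi^{*}_2, \pi^{*}_3, \ldots, \pi^{*}_s$ were the probabilities of
moving to pixel j based on habitat. The deer chose a
20-meter pixel based on a random draw from the
multinomial distribution ($\pi^{*}_1, \pi^{*}_2, \pi^{*}_3, \ldots, \pi^{*}_s$).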


Model Calibration 


TABLE 39.3.

Parameter settings for the first simulation experiment for adult
females.

Experiment factor         Low             High
Number of steps^a         150             500
First-stage pixel size    40 meters       60 meters
Memory length             2 months        6 months
λ^b                       2               10
φ^c                       2               4
μ^d                       1,000 meters    2,000 meters

^a Number of movement steps per five-day interval.
^b Relative affinity score for previously visited pixels.
^c Maximum affinity used in equation 39.5.
^d Distance from homing beacon at which affinity is constant.


The first experiment was conducted using a half fraction
of a $2^6$ factorial design (thirty-two EUs) (Tables
39.3, 39.4). This design allowed estimation of all six
main effects and all fifteen first-order interactions, as well
as ten higher-order interactions; the higher-order interactions
were assumed negligible and used as an estimate
of experimental error. Each EU consisted of
thirty deer simulated over a fifteen-year time interval. 
Seventy-two locations per year (one per five days) 
were used to calculate summary statistics. The out- 
come summary statistics were annual average home 
range size, distance between consecutive locations, 
distance between consecutive annual centers, and per- 
centage of observations in each habitat. Burn-in time 
for each EU for each outcome was estimated using re- 
peated measures analyses with tests for linear time 
trends. Based on an α-level of 0.01, the maximum estimated
burn-in time was four years, so summary data
from the fifth year of the simulations were used to 
evaluate model parameters using ANOVA. 
Discrepancy measures (DMs) were calculated for 
annual home range size, distance between consecutive 
radio locations, and distance between annual centers 
using Equation 39.2 with β = 1. The DM for habitat
use was calculated using Equation 39.3. Significance 
of main effects and interactions was based on relative 
importance, using the magnitude of the F-statistic and 
the effect size, with the goal of focusing on factors 
with the largest impact on the DMs. The home range
size DM was reduced with fewer steps, 40-meter pixels
for the first stage of the step, larger λ and φ, and
smaller μ. For distance between consecutive locations,
there was a significant interaction between
first-stage pixel size and number of steps. If the first-stage
pixels were 40 meters, more steps reduced the
DM; however, if first-stage pixels were 60 meters,
fewer steps reduced the DM (Fig. 39.4). The only
model parameter affecting the DM for distance between
consecutive centers was φ; a larger φ reduced
the DM. The DM for habitat use was reduced with
fewer steps.


TABLE 39.4.

Factor levels and outcomes of home range size and distance between consecutive locations from
the fifth simulation year of the first calibration experiment.

EU  Number of steps^a  Pixel size (m)^b  Memory length (mo)  λ    φ   μ (m)   Home range (ha)  Distance (m)
1   150                40                2                   2    2   1,000   595              414
2   150                40                2                   2    4   2,000   523              431
3   150                40                2                   10   2   2,000   470              388
4   150                40                2                   10   4   1,000   269              393
5   150                40                6                   2    2   2,000   686              434
6   150                40                6                   2    4   1,000   336              416
7   150                40                6                   10   2   1,000   429              386
8   150                40                6                   10   4   2,000   403              417
9   150                60                2                   2    2   2,000   1,508            644
10  150                60                2                   2    4   1,000   688              623
11  150                60                2                   10   2   1,000   821              565
12  150                60                2                   10   4   2,000   762              612
13  150                60                6                   2    2   1,000   1,037            653
14  150                60                6                   2    4   2,000   882              641
15  150                60                6                   10   2   2,000   1,001            5
16  150                60                6                   10   4   1,000   460              582
17  500                40                2                   2    2   2,000   1,205            tial
18  500                40                2                   2    4   1,000   414              650
19  500                40                2                   10   2   1,000   634              632
20  500                40                2                   10   4   2,000   600              682
21  500                40                6                   2    2   1,000   750              737
22  500                40                6                   2    4   2,000   723              713
23  500                40                6                   10   2   2,000   836              648
24  500                40                6                   10   4   1,000   363              624
25  500                60                2                   2    2   1,000   1,929            1,039
26  500                60                2                   2    4   2,000   1,292            1,017
27  500                60                2                   10   2   2,000   1,605            982
28  500                60                2                   10   4   1,000   Tati             894
29  500                60                6                   2    2   2,000   1,830            1,106
30  500                60                6                   2    4   1,000   810              929
31  500                60                6                   10   2   1,000   1,185            928
32  500                60                6                   10   4   2,000   981              962

^a Number of movement iterations per five-day interval.
^b Size of pixels for the first stage of the movement step.
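
The interaction between number of steps and first-stage pixel size can be illustrated by averaging the distance DM within each combination of the two factors. The Python sketch below is an illustration only, not the ANOVA reported above: it uses sixteen of the thirty-two EUs (EUs 1-4, 9-12, 18-21, and 29-32 of Table 39.4), takes the observed female mean of 686 meters from Table 39.1, and the function name is hypothetical.

```python
from collections import defaultdict

OBSERVED_DISTANCE_M = 686  # observed female mean, Table 39.1

# (number of steps, first-stage pixel size in m, simulated distance in m)
# for a subset of EUs from Table 39.4.
ROWS = [
    (150, 40, 414), (150, 40, 431), (150, 40, 388), (150, 40, 393),
    (150, 60, 644), (150, 60, 623), (150, 60, 565), (150, 60, 612),
    (500, 40, 650), (500, 40, 632), (500, 40, 682), (500, 40, 737),
    (500, 60, 1106), (500, 60, 929), (500, 60, 928), (500, 60, 962),
]


def interaction_cell_means(rows, observed):
    """Mean absolute discrepancy (Equation 39.2 with beta = 1) for each
    combination of number of steps and first-stage pixel size."""
    cells = defaultdict(list)
    for steps, pixel, distance in rows:
        cells[(steps, pixel)].append(abs(distance - observed))
    return {cell: sum(v) / len(v) for cell, v in cells.items()}


cell_means = interaction_cell_means(ROWS, OBSERVED_DISTANCE_M)
```

In this subset the mean DM falls as steps increase with 40-meter pixels and rises as steps increase with 60-meter pixels, which is the crossing pattern an interaction plot displays.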

Based on these results, the factor levels for the next
experiment were determined. Because the interaction
between number of movement iterations and first-stage
pixel size indicated two conflicting directions in
the parameter space for minimizing the DM for distance
between consecutive locations, one direction
was chosen for the next experiment. A pixel size of 60
meters and fewer steps were selected for several reasons.
Although 60-meter pixels did increase the DM
for home range size, reducing the number of steps
decreased the DMs for both home range size and habitat
use. Furthermore, run time for each simulation, an
important consideration in computationally intensive
simulations, was shorter with fewer steps. Levels of
the four other factors were altered according to the
observed trends in this experiment. Relative to the first
experiment, memory length remained the same, λ and
φ were increased, and μ was decreased (Table 39.5).


Figure 39.4. Interaction plot of the number of steps and first-stage
pixel size for the distance between consecutive locations
discrepancy measure for the first calibration experiment.
Axes: discrepancy (m), number of steps in 5-day interval, and
first-stage pixel size (m).


TABLE 39.5.

Parameter settings for the second simulation experiment for
adult females.^a

Experiment factor      Low           High
Number of steps^b      100           300
Memory length          2 months      6 months
λ^c                    4             12
φ^d                    3             5
μ^e                    750 meters    1,750 meters

^a First-stage pixel size was fixed at 60 meters for this experiment.
^b Number of steps per five-day interval.
^c Relative affinity score for previously visited pixels.
^d Maximum relative affinity used in equation 39.5.
^e Distance from homing beacon at which affinity is constant.


The second experiment was conducted using a half
fraction of a $2^5$ factorial design (sixteen EUs) (Table
39.6). This design allowed estimation of all main effects
and first-order interactions. Three EUs at the center
point of the design (number of steps = 200, memory
length = 4 months, λ = 8, φ = 4, and μ = 1,250
meters) were run to estimate experimental error
because no higher-order interactions were estimable
with this design. Again, each EU consisted of thirty
deer simulated over a fifteen-year time interval, and
the summary outcomes and DMs were calculated.
Based on an α-level of 0.01, the maximum estimated
burn-in time for each outcome for each EU was four
years; therefore, summary data from the fifth year of
simulation were used to evaluate model parameters.

There were no significant interactions among
model parameters (p > 0.05) for any of the four DMs.
Discrepancies for home range size were reduced significantly
with fewer steps, larger φ, and smaller μ. For
distance between consecutive locations, changes in
factor levels did not significantly affect the DM. Mean
distance between consecutive locations ranged from
458 to 842 meters, approaching the observed mean of
671 meters. DMs for distance between consecutive
home range centers decreased as φ increased. The DM
for habitat use was not significantly affected by the
model factors; however, simulated deer were located
more often in tree islands and less often in the wet
prairie than observed deer, indicating a need to reevaluate
the habitat affinity scores.
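
The role of the three center-point EUs can be shown with a short Python sketch. Because the center points are replicates at identical factor settings, the variance among their DMs estimates experimental error with two degrees of freedom. The variable names are hypothetical; the home range sizes are those of the three center-point EUs in Table 39.6, and the observed female home range of 271 hectares is from Table 39.1.

```python
from statistics import variance

OBSERVED_HOME_RANGE_HA = 271  # observed female mean, Table 39.1

# Home range sizes (ha) of the three center-point EUs, Table 39.6.
center_home_ranges_ha = [640, 598, 720]

# Discrepancy measure with beta = 1 for each replicate.
center_dms = [abs(h - OBSERVED_HOME_RANGE_HA) for h in center_home_ranges_ha]

# Sample variance among replicates: an estimate of experimental error
# with len(center_dms) - 1 = 2 degrees of freedom.
pure_error_variance = variance(center_dms)
pure_error_sd = pure_error_variance ** 0.5
```

With only two degrees of freedom this estimate is imprecise, which is one reason effect sizes are worth examining alongside F-statistics when screening factors.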

In addition to quantitative analyses of simulation 
results, qualitative observations also aided in the veri- 
fication and calibration processes. Movement paths of 
simulated deer and observed deer residing in the same 
geographic area were plotted and compared. For ex- 
ample, during a one-year interval, a simulated deer 
(parameters: number of steps over five days = 300, 
memory length = 6 months, λ = 12, φ = 3, and μ = 750
meters) had a realistic movement pattern when compared
to an observed deer (three-year-old female) in
the same geographic area (Fig. 39.5). However, of the 
thirty deer in this EU, only 50 percent had movement 
paths and habitat use patterns that were similarly real- 
istic, indicating further need for the refinement of the 
simulation model. 

In the subsequent experiment, number of steps was 
decreased, memory length and λ were kept at the same
settings, φ was increased, and μ was decreased. Habitat
affinity scores were included as factors in the experiment.
An initial estimate of burn-in time was now
available, so seasonally fluctuating factors (i.e., water 
levels) were added to the model in later calibration ex- 
periments, and DMs also were calculated and evalu- 
ated for hydrologic seasons. 


TABLE 39.6.

Factor levels and outcomes of home range size and distance between consecutive
locations from the fifth simulation year in the second calibration experiment.^a

EU    Number of steps^b  Memory length (mo)  λ    φ   μ (m)   Home range (ha)  Distance (m)
1     100                2                   4    3   1,750   792              499
2     100                2                   4    5   750     428              496
3     100                2                   12   3   750     436              458
4     100                2                   12   5   1,750   487              487
5     100                6                   4    3   750     502              503
6     100                6                   4    5   1,750   528              493
7     100                6                   12   3   1,750   638              470
8     100                6                   12   5   750     356              481
9^c   200                4                   8    4   1,250   640              648
10^c  200                4                   8    4   1,250   598              653
11^c  200                4                   8    4   1,250   720              635
12    300                2                   4    3   750     709              mST
13    300                2                   4    5   1,750   831              826
14    300                2                   12   3   1,750   953              741
15    300                2                   12   5   750     474              724
16    300                6                   4    3   1,750   1,044            842
17    300                6                   4    5   750     538              734
18    300                6                   12   3   750     539              685
19    300                6                   12   5   1,750   785              790

^a First-stage pixel size was fixed at 60 meters for this experiment.
^b Number of steps per five-day measurement interval.
^c Center point of the experimental design.


Figure 39.5. Plot of radio locations of an observed adult female
and locations of a simulated deer over a one-year interval
in the same geographic region. Separate line symbols distinguish
the observed deer from the simulated deer; the height and width
of the box are 3 km.


Conclusions 


The set of calibration tools and the iterative approach 
developed and demonstrated in this chapter start a 
framework for the more rigorous evaluation of simu- 
lation models. The general approach and techniques 
could be used for development of other classes of sim- 
ulation models as well as IBSE models. Simulation 
model building is an iterative process; the developer 
must constantly reevaluate the model and update al- 
gorithms and parameter values to obtain the best fit 
possible. Burn-in time is explored as a nuisance pa- 
rameter of simulations, and a technique for its estima- 
tion is presented. The degree of realism of the simula- 
tion is quantified by the use of discrepancy measures 
for each outcome summary statistic of the simulation. 

Experimental designs and analysis techniques of re- 
sponse surface methods are presented as tools for the 
calibration of IBSE simulation models. The use of
these techniques aids in the search for the optimum 
model parameter settings that will best represent the 
system of interest. This systematic, iterative approach 
logically attacks this multifactor optimization problem 
and provides opportunities for improvement in the 
model with each simulation experiment. 
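
As a rough illustration of the kind of experimental design listed in Table 39.6, the following sketch generates a two-level half-fraction design for five factors and appends replicated center points. The generator used here (the coded level of the fifth factor is the product of the coded levels of the other four) is an assumption: it is a standard choice and it reproduces the first run and the center point of the table, but the chapter does not state it. The function name and the placeholder names for the last three factors are hypothetical.

```python
from itertools import product

# Low and high levels of the five factors, taken from Table 39.6.
# The last three names are placeholders for the three movement parameters.
LEVELS = {
    "steps": (100, 300),
    "memory_months": (2, 6),
    "param_3": (4, 12),
    "param_4": (3, 5),
    "param_5_m": (750, 1750),
}


def half_fraction_design(levels, n_center=3):
    """Return a two-level half-fraction design plus center points.

    Assumption: the coded level (-1 or +1) of the last factor equals the
    product of the coded levels of all other factors.
    """
    names = list(levels)
    runs = []
    for coded in product((-1, 1), repeat=len(names) - 1):
        last = 1
        for c in coded:
            last *= c
        full = coded + (last,)
        # Map coded level -1 to the low value and +1 to the high value.
        runs.append({n: levels[n][(c + 1) // 2] for n, c in zip(names, full)})
    # Center point: midpoint of every factor, replicated n_center times.
    center = {n: (lo + hi) / 2 for n, (lo, hi) in levels.items()}
    runs.extend(dict(center) for _ in range(n_center))
    return runs


design = half_fraction_design(LEVELS)
for eu, run in enumerate(design, start=1):
    print(eu, run)
```

The replicated center points give an estimate of run-to-run variation and a check for curvature, which is what makes such a design a reasonable first step before fitting a response surface.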

The experimentation revealed that the movement 
patterns of these deer can be modeled by the combina- 
tion of several algorithms, including a two-stage 
movement step and a memory that is updated over 
time. Our example demonstrates that the described 
techniques facilitate the attainment of the model algo- 
rithms and parameter values that minimize the dis- 
crepancies between observed and simulated data. The 
final simulation model was tested for its predictive ac- 
curacy using deer radio-telemetry data collected from 
the same geographic area during the flood of 
1994-1995 (Hartless 2000). The simulation model
can be used by South Florida planners as they work
toward restoration of the Everglades to better under- 
stand the impacts of altering water control regimes on 
the white-tailed deer population. 
Predicting Species Presence and Abundance


Frances C. James and Charles E. McCulloch 


In spite of the important environmental achievements
it has permitted, the provisions of the Endangered
Species Act cover only a limited range of biodiversity
and resource problems. Managers would
welcome practical ways to address additional prob- 
lems and advice about how to choose among manage- 
ment options (Guikema and Milke 1999). Even 
progress toward predicting species distributions effi- 
ciently would be a step in the right direction. 

This chapter reports on state-of-the-art examples of 
ways to accomplish that reduced objective, to produce 
maps for the probable presence, abundance, or ab- 
sence of terrestrial species across large geographic 
areas, in cases when detailed local data are not available.
Two chapters (Shapiro et al., Chapter 49; Pearson
and Simons, Chapter 52) apply simulation modeling
to related questions, and one (Angermeier et al.,
Chapter 46) discusses processes regulating distribu- 
tions of stream fishes. With the recent advent of geo- 
graphic information systems (GIS) for storing geospa- 
tial data, the availability of large data sets for climate 
variables, and satellite imagery and aerial photogra- 
phy for vegetation and landform variables, the subject 
of modeling the distributions of organisms has be- 
come a major enterprise within the field of landscape 
ecology. In this sphere, predicting species occurrences 
is less a matter of identifying limiting resources and 
more a matter of finding broad-scale associations between
the distributions of taxa and combinations of
values of readily available environmental variables.
Some broad-scale environmental variables, such as 
those for landform, undoubtedly do limit distributions 
indirectly, but the motivation behind these studies 
seems tied more closely to the magnitude of biodiver- 
sity problems and the need for distribution maps. 

An important example of the concept described 
above is gap analysis, a habitat-association method 
that uses distributions of vegetation classes and GIS 
technology to predict distributions of vertebrates and 
other taxa (Scott et al. 1993). In addition to vegeta- 
tion and land-cover data, various landscape metrics 
taken from satellite imagery and climate data can be 
added as filters. Overlays, made with ArcView soft- 
ware, allow the identification of areas of high species 
richness or endemicity. Such areas that are not already 
protected are gaps, priority areas for ground surveys, 
and potential sites for new reserves. Gap analysis 
began in Hawaii and Idaho but has been extended by 
the U.S. Geological Survey to every state (Loomis and 
Echohawk 1999). 
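
The overlay logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of any gap analysis project: predicted ranges are represented as sets of grid-cell coordinates, the species names, cells, and richness threshold are invented for the example, and the function name is hypothetical.

```python
from collections import Counter


def richness_gaps(predicted_ranges, protected_cells, min_richness):
    """Return unprotected cells whose predicted richness meets a threshold.

    predicted_ranges: dict mapping a species name to a set of grid cells
    protected_cells:  set of grid cells already inside reserves
    min_richness:     smallest number of species that counts as "rich"
    """
    richness = Counter()
    for cells in predicted_ranges.values():
        richness.update(set(cells))  # each species counts once per cell
    return sorted(
        cell
        for cell, n_species in richness.items()
        if n_species >= min_richness and cell not in protected_cells
    )


# Hypothetical predicted ranges on a small grid of (row, column) cells.
predicted = {
    "species_a": {(0, 0), (0, 1), (1, 1)},
    "species_b": {(0, 1), (1, 1), (2, 2)},
    "species_c": {(1, 1), (2, 2)},
}
protected = {(1, 1)}

gaps = richness_gaps(predicted, protected, min_richness=2)
print(gaps)  # [(0, 1), (2, 2)]
```

The richest cell, (1, 1), is already protected, so it is not reported; the remaining cells with at least two predicted species are the candidate gaps.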

GIS predictions about the locations of species 
within their geographic ranges can be made not only 
from analysis of overlays but also with statistical mod- 
els that work from data for the distribution of a focal 
taxon, combinations of associated environmental vari- 
ables, and mapping routines. Probabilistic methods 
for estimating the extent of errors of omission (predic- 
tion of absence when a species is present) and 
commission (prediction of presence when it is absent)
can be assessed and biases accounted for by various 
statistical approaches. Sometimes data are for individ- 
ual marked animals recorded at more than one loca- 
tion. A sample or full data set for accompanying envi- 
ronmental variables can be measured at the site or 
derived from other sources. Because a set of correlated 
environmental variables is considered simultaneously, 
the challenge of the analysis is to reduce complex mul- 
tivariate relationships and express them in clever ways 
that provide reliable predictions. 
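
The two kinds of error defined above can be computed directly from paired observations and predictions. The sketch below is illustrative only: the site data are invented, the function name is hypothetical, and the choice of denominators (omission as a fraction of sites where the species is present, commission as a fraction of sites where it is absent) is one common convention among several.

```python
def omission_commission(observed, predicted):
    """Return (omission rate, commission rate) for presence/absence data.

    observed, predicted: equal-length sequences of booleans, one per site,
    where True means the species is present (or predicted present).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    pairs = list(zip(observed, predicted))
    n_present = sum(1 for obs, _ in pairs if obs)
    n_absent = len(pairs) - n_present
    # Omission: species present but predicted absent.
    omitted = sum(1 for obs, pred in pairs if obs and not pred)
    # Commission: species absent but predicted present.
    committed = sum(1 for obs, pred in pairs if pred and not obs)
    omission = omitted / n_present if n_present else float("nan")
    commission = committed / n_absent if n_absent else float("nan")
    return omission, commission


# Hypothetical survey of ten sites.
observed = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, True, False, False, False, False]

omission, commission = omission_commission(observed, predicted)
print(omission, commission)
```

In this example one of four occupied sites is missed and two of six unoccupied sites are predicted as occupied.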

Traditional multivariate methods such as multiple 
linear regression, principal components analysis, dis- 
criminant function analysis, and Mahalanobis dis- 
tance, which look for linear relationships, are being 
used. However, generalized linear models have gained 
in popularity as an alternative. They have the advan- 
tages of allowing some nonlinear responses, allowing 
a variety of distributions for the outcome variable, 
and accommodating prediction from a set of variables 
that may be continuous or discrete (Nicholls 1989). 
Generalized linear modeling includes multiple regres- 
sion, logistic regression, and Poisson regression as spe- 
cial cases. Austin et al. (1990) used quadratic logistic 
regression for predicting presence and absence. 
Vernier et al. (Chapter 50) used Poisson regression for 
predicting abundance. Two compound nonlinear 
methods are classification and regression tree analysis 
(CART) (see Breiman et al. 1984) and genetic algo- 
rithm for rule-set prediction (GARP) (see, e.g., Lees 
and Ritman 1991; Stockwell and Noble 1992). CART 
derives flexible, nonlinear regressions and attempts to 
find the minimal ranges of environmental variables 
that fit the data (see, e.g., Nix 1986 for BIOCLIM to 
describe an environmental envelope). GARP is a deci- 
sion tree and rule-induction approach, an artificial in- 
telligence rule-set method that uses expert-system 
rules, logistic regression, CART, and envelope-type 
models and produces distribution maps. 
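
To make the idea of a nonlinear response within a generalized linear model concrete, the sketch below evaluates a quadratic logistic model of the general kind mentioned above, in which the probability of presence along one environmental gradient is the inverse logit of a quadratic function. The coefficients and the gradient are hypothetical and are not taken from any of the cited studies.

```python
import math


def quadratic_logistic(x, b0, b1, b2):
    """Probability of presence at gradient value x.

    The linear predictor is b0 + b1*x + b2*x**2, and the logit link
    maps it to a probability between 0 and 1.
    """
    eta = b0 + b1 * x + b2 * x * x
    return 1.0 / (1.0 + math.exp(-eta))


def gradient_optimum(b1, b2):
    """Gradient value at which the predicted probability peaks."""
    if b2 >= 0:
        raise ValueError("a single-peaked response requires b2 < 0")
    return -b1 / (2.0 * b2)


# Hypothetical coefficients for a temperature-like gradient.
B0, B1, B2 = -8.0, 1.2, -0.04

peak_x = gradient_optimum(B1, B2)
peak_p = quadratic_logistic(peak_x, B0, B1, B2)
for x in (5, 10, 15, 20, 25):
    print(x, round(quadratic_logistic(x, B0, B1, B2), 3))
```

Because the squared term has a negative coefficient, the fitted curve rises to a single optimum and falls away on either side, a shape that a logistic model with only a linear term cannot represent.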

Peterson et al. (Chapter 55) begin with known 
points of occurrence of a species and a set of environ- 
mental data for both these points and other places, 
where the species was not recorded. Then they apply 
“ecological niche modeling,” which uses the GARP
rule-set procedure (Stockwell and Noble 1992) to produce
distribution maps.


Collectively the chapters in this section show the 
various ways researchers are predicting broad-scale 
species distributions. Dreisbach et al. (Chapter 41) 
used habitat suitability index scores (USFWS 1981b) 
and GIS rules to predict distributions of species of 
fungi in a gap-type system, pointing out the dietary 
links between fungi, flying squirrels, and northern 
spotted owls in the Pacific Northwest. Fertig and 
Reiners (Chapter 42) applied both nonlinear logistic 
regression and classification and regression tree analy- 
sis to a combination of herbarium data for plant local- 
ities and environmental data. They produced proba- 
bilistic range maps for the entire state of Wyoming. 
Van Manen et al. (Chapter 43), with data for surviv- 
ing canker-resistant butternut trees in the Great 
Smoky Mountains National Park and associated habi- 
tat characteristics, used Mahalanobis distances and lo- 
gistic regression to predict the locations of other po- 
tential sites and used jackknifing and bootstrapping 
for validation. Debinski et al. (Chapter 44) used field 
information and multispectral satellite data in espe- 
cially small mapping units (0.25 hectare) for a set of 
montane meadows in the greater Yellowstone ecosys- 
tem. They used classification and regression tree 
analysis, discriminant analysis, and regression to map 
four types of plant communities. Various birds and 
butterflies were associated with each type. Gonzalez- 
Rebeles et al. (Chapter 57) removed filter variables 
one at a time from data for vertebrates in New Mex- 
ico to see how the predicted distributions would 
change. Zimmerman and Breitenmoser (Chapter 58) 
had radiotelemetry data for the locations of Eurasian 
lynx in the Jura Mountains of Switzerland. They used 
repeated runs of discriminant function analysis to pre- 
dict the potential distribution of lynx and found that 
the distribution of prey (roe deer and chamois) was 
more important to the predictions than were habitat 
variables. Shriner et al. (Chapter 47) had data for the 
presence and absence of the wood thrush from point 
counts in Great Smoky Mountains National Park. 
They used logistic regression to predict occurrence 
throughout the park. In this case, topographic and 
landform indices were more useful than were local 
habitat variables. Fleischman et al. (Chapter 45) used 
data for butterflies in the Great Basin to demonstrate 
their recommended two-stage modeling process. They 
used multiple linear regression to predict species richness
from environmental variables and then logistic
regression to identify environmental variables that ac- 
counted for the presence of individual species. Vernier 
et al. (Chapter 50) used Poisson regression to predict 
the abundances of birds from forest inventory data in 
the boreal forest of Alberta, Canada. They used 
Akaike’s information criterion (AIC, see Burnham and 
Anderson 1998) to select the final model. The data 
were taken at two scales, both of which were relevant 
to forestry practices, and links to simulation models 
for management options are planned. 

The best way to check the accuracy of prediction 
methods is to try them first in a place where the distri- 
butions are known and predictions can be checked. 
That was tried by Karl et al. (Chapter 51), who used 
simulations to explore whether rarity has an effect on 
model accuracy over and above the effect of small 
sample size. They concluded that it does not. The 
availability of extensive field data for seven species of 
birds in Montana was used to validate their simulated 
results. Dettmers et al. (Chapter 54) began with point- 
count data for six species of birds and values of re- 
motely sensed spatial data at the same sites in Ten- 
nessee. They compared predictions made by logistic 
regression, Mahalanobis distance, classification and 
regression trees, and discriminant function analysis for 
each species and decided that the method of classifica- 
tion and regression trees is best. Their judgment was 
based on jackknife tests and on how well the final, re- 
duced models predicted occurrences in Georgia and 
Virginia. Hepinstall et al. (Chapter 53) used field data 
for birds in Maine to make a comparison between 
gap-type habitat-association modeling and a statistical 
approach that used Bayes’ theorem to model associa- 
tions with data from satellite imagery, a vegetation 
and land-cover map, and two derived layers of hetero- 
geneity. They found that both methods overestimated 
the distributions of generalist-type species and that the 
overestimation was greater with the gap-type analysis. 

Stockwell and Peterson (Chapter 48) are propo- 
nents of combining new data-mining techniques with 
flexible bias management within the GARP rule-set 
procedure (Stockwell and Noble 1992). The example 
here relates Breeding Bird Survey data (Sauer et al. 
1999) for the wood thrush to data for mean annual
temperature and precipitation. They explore ways to
control three types of potential bias in predictions 
made from such data. 

Of the two chapters reporting simulation work, 
Pearson and Simons (Chapter 52) used landscape-level 
metrics and an individual rule-based simulation model 
to compare the relative likelihood of successful 
stopovers during migration of birds that are habitat 
specialists and generalists. The objective here was to 
generate hypotheses for future work. Shapiro et al. 
(Chapter 49) were interested in the probability of par- 
asitism by brown-headed cowbirds in nests of the en- 
dangered golden-cheeked warbler and black-capped 
vireo in Texas. They applied an individual rule-based 
simulation approach to female cowbird movement. 
Such models can be constructed to mimic the behavior 
of individual animals as they move among spatial 
units (pixels) on a grid having different environmental 
values. 
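
A minimal sketch of such an individual rule-based model follows. It is not the model used in either chapter: the single movement rule (step to one of the neighboring pixels with probability proportional to that pixel's environmental value), the grid of values, and all names are assumptions made for illustration.

```python
import random


def simulate_walk(habitat_value, start, n_steps, seed=0):
    """Move one simulated animal across a grid of pixels.

    habitat_value: list of rows of non-negative numbers, one per pixel
    start:         (row, column) of the starting pixel
    n_steps:       number of movement steps to simulate
    Rule (assumed): at each step the animal moves to one of the up to
    eight neighboring pixels, chosen with probability proportional to
    the neighbor's value; it stays put if all neighbors have value 0.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(habitat_value), len(habitat_value[0])
    r, c = start
    path = [(r, c)]
    for _ in range(n_steps):
        neighbors = [
            (r + dr, c + dc)
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
            and 0 <= r + dr < n_rows
            and 0 <= c + dc < n_cols
        ]
        weights = [habitat_value[i][j] for i, j in neighbors]
        if sum(weights) > 0:
            r, c = rng.choices(neighbors, weights=weights, k=1)[0]
        path.append((r, c))
    return path


# Hypothetical 3 x 3 grid; pixels with value 0 are never entered.
grid = [
    [3, 3, 0],
    [1, 2, 0],
    [0, 1, 1],
]
path = simulate_walk(grid, start=(1, 1), n_steps=50, seed=42)
print(path[:10])
```

Summaries of many such paths, for example the proportion of steps spent in each pixel type, are the kind of simulated outcome that can then be compared with field observations.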

Recently, researchers have begun to combine sub- 
models using the spatially explicit rule-based ap- 
proach with submodels using the statistical ap- 
proaches and even submodels using state-variable 
approaches. An example is the paper by Gross and 
DeAngelis (Chapter 40), which describes the ambi- 
tious Across Trophic Level System Simulation for the 
Everglades. It begins with a broad-scale state model 
for the hydrological system and uses regional GIS and 
remote sensing to assess the impact of alternate man- 
agement scenarios. Submodels have higher levels of 
resolution, the highest being detailed individual-based 
models for species of special interest. This project is 
squarely in the new field of computational ecology. 

The only chapter in this section that addresses the 
analysis of causes of discontinuous distributions is the 
one by Angermeier et al. (Chapter 46) on stream fishes 
in Virginia. It discusses the complicating roles of pop- 
ulation dynamics, density dependence, and temporal 
variation in multiscale analyses. For example, if popu- 
lation density is low, habitat associations may appear 
to be weak, even if they are in fact strong. This impor- 
tant paper does not present a quantitative analysis, 
but it highlights issues not mentioned elsewhere in the 
section. See also Hobbs and Hanley (1990). 

Biometric approaches to modeling distributions of 
species and communities based on environmental 
variables began thirty years ago with studies of the
distribution of birds in relation to the structure of the 
vegetation in their territories (Cody 1968; James 
1971; Anderson and Shugart 1974). Wildlife biolo- 
gists and land-use planners at that time became hope- 
ful that such work would have practical applications 
for assessing the effects of habitat treatments, but in 
the 1970s communication between researchers and 
managers was poor (Verner 1981). Many of the habi- 
tat-suitability-index models and habitat-evaluation 
procedures that followed had low predictive power 
(Stauffer and Best 1986). There were complaints 
about the reliability of the results of applications of 
multivariate analysis reported in habitat modeling 
studies (Johnson 1981a; Rexstad et al. 1988) and con- 
cerns about the lack of attention to issues of scale 
(Wiens 1981b). Managers were not in fact finding 
such models as useful as they had hoped (Marcot 
1986). 


Data Could Be Improved in Future Work: 
Six Issues of Accuracy and Scale 


The new generation of models in the 1990s uses GIS 
hardware and software to create spatial analyses and 
makes use of new broad-scale environmental data- 
bases (Morrison et al. 1998). Remote sensing, vegeta- 
tion classification, and landform and climate data are 
often used to link the environment with the focal 
species or community so that predictions can be made. 
Even though access to such information is exciting 
and opens new possibilities (see, e.g., Sillett et al. 
2000), whether landscape-scale predictors of biodiver- 
sity will provide useful data to managers is still uncer- 
tain (Short and Hestbeck 1995), and software for spa- 
tial statistical analysis of environmental data in a 
space-time framework that properly accommodates 
spatial correlation in the data is not yet available 
(Cressie and Ver Hoef 1993). In the meantime, here 
are six, by no means exhaustive, ways we think issues 
of accuracy and scale in predictions of presence and 
abundance of species from environmental data could 
be improved in future work. These suggestions do not 
originate with us. The same scientific principles and 
statistical requirements that applied to former generations
of models (Johnson 1981a) still hold. Attention
to the first four ideas will improve predictions, but
it will take all six to make the predictions useful to
managers.


1. Use more a priori thinking. 

Develop a short list of candidate models for 
testing. Automated statistical techniques cannot be 
expected to sort out complicated multidimensional 
relationships without biological input (Box and 
Hill 1967; Burnham and Anderson 1998). As a re- 
sult, the final model often includes too many 
terms, and both the composition of terms in the 
model and its predictions may be unstable. Think 
hard about the choice of variables and the likeli- 
hood that they constrain the distribution or func- 
tioning of organisms. For example, thermometers 
and rain gauges are cheap, but psychrometric vari- 
ables that are a function of temperature and mois- 
ture jointly, like absolute humidity and wet-bulb 
temperature, are more likely to affect processes in 
living organisms than are the more conventional 
factors, dry-bulb temperature and precipitation 
(James 1991). 

2. Pay more attention to model selection. 

The process of model selection often has as 
much to do with the final interpretations as it does 
with the fit of the decided-upon model. Nonpara- 
metric assessments of the model like the bootstrap 
(Efron and Gong 1983; Manly 1997) should be 
used to assess stability honestly. By that we mean 
that the model-selection process should be repli- 
cated, not just the fit of the final model, especially 
when complicated model-building processes are 
used and when the ratio of data to number of 
model terms is small. Burnham and Anderson 
(1998) advocate selection of the approximately 
best model, the one that simultaneously accounts 
for the most variation in the data with the fewest 
terms. The addition of significant terms in the 
model may not improve Akaike’s information cri- 
terion. 

3. Validate the final model. 

Habitat-based models have rarely been vali- 
dated in the past (Chalk 1986; Raphael and Mar- 
cot 1986; Hansen et al. 1999), but validation is 
crucial if we are to understand the accuracy of 
predictions. The best validation is accomplished by
repetition of the modeling exercise with newly ob- 
tained data from the population of interest (Taylor 
1990; Chatfield 1995). This method often is not 
feasible, and then we must resort to resampling 
methods (Verbyla and Litvaitis 1989) or methods 
that split the data. If the data are split for valida- 
tion purposes, they should still all be used to pro- 
duce robust classification rules (Rencher 1995). 

4. State the limitations of the proper interpretation of
the results.

Remember that tests of regression coefficients 
do not tell you that the variables in the final model 
are in fact important to the species, just that they 
are jointly related to the presence of the species. If 
the objective is to produce cost-efficient range 
maps and locate sites of high conservation value 
for future on-the-ground surveys and management, 
then the job is simply to improve confidence about 
their accuracy. Reliable inference about causes is a 
false expectation from empirical modeling work 
such as the studies reported in this section. 

5. Distinguish between hypothesis testing as statistical
inference and hypothesis testing as analyzing the
causes of processes.

With insightful analyses involving substantial a 
priori thinking, some progress can be made toward 
the discovery of environmental factors that man- 
agers might change to help a species. However, if 
the real challenge for managers is to assure the 
long-term health of representative ecosystems, then 
the task is much more difficult than evaluating statistically
significant associations from empirical
models. To be useful to managers, predictions will
have to be more focused on analyses of those envi- 
ronmental factors that are directly limiting and 
those that can be manipulated. The most obvious 
factors are those related to ecological succession, 
such as the history of fire. Inferences about 
processes that constrain distributions will be weak 
until they are set up as hypothesis tests that com- 
pare distributions among sites having different 
combinations of levels of environmental factors in 
an experimental design. The entire step of working 
experimental design into observational studies 
(Cochran 1983; James and McCulloch 1985, 
1995; Rosenbaum 1995), which should come be- 
tween the empirical predictions described in this 
chapter and making management plans, is not cur- 
rently being addressed. Then adaptive manage- 
ment, as described by Kish (1987), can begin. 

6. Think about the scales of processes that may be 
constraining distributions horizontally as well as 
vertically. 


We manage places in patches, which are defined 
vertically, but processes that limit populations must be 
operating horizontally across their geographic ranges. 
Saving biodiversity by setting aside protected areas 
may succeed, but only by the accumulated influence of 
a large number of populations of individual species 
with individual requirements across broad horizontal 
scales. Even if single-species management is an inade- 
quate management approach (Moss 2000), single- 
species research and on-the-ground fieldwork are still 
needed for progress in understanding how nature 
works. 
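
The second point in the list above, selection among a short list of candidate models by Akaike's information criterion, can be illustrated with a small worked example. The sketch assumes ordinary least-squares models with normally distributed errors, for which AIC can be written as n ln(RSS/n) + 2k up to an additive constant, where RSS is the residual sum of squares and k counts the regression coefficients plus the error variance. The data, predictor names, and function names are hypothetical.

```python
import math


def rss_intercept_only(y):
    """Residual sum of squares for a model with only a mean."""
    mean_y = sum(y) / len(y)
    return sum((v - mean_y) ** 2 for v in y)


def rss_simple_regression(x, y):
    """Residual sum of squares for a least-squares line of y on x."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


def aic_least_squares(rss, n, n_params):
    """AIC for a least-squares fit, omitting a constant shared by all models."""
    return n * math.log(rss / n) + 2 * n_params


def rank_by_aic(candidates, n):
    """Return (name, AIC difference from the best model), best first."""
    scored = [(name, aic_least_squares(rss, n, k))
              for name, (rss, k) in candidates.items()]
    best = min(aic for _, aic in scored)
    return sorted(((name, aic - best) for name, aic in scored),
                  key=lambda item: item[1])


# Hypothetical data: abundance tracks predictor_a closely, predictor_b weakly.
abundance = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
predictor_a = [1, 2, 3, 4, 5, 6, 7, 8]
predictor_b = [5, 1, 4, 2, 8, 3, 7, 6]

candidates = {
    "intercept only": (rss_intercept_only(abundance), 2),
    "predictor_a": (rss_simple_regression(predictor_a, abundance), 3),
    "predictor_b": (rss_simple_regression(predictor_b, abundance), 3),
}
ranking = rank_by_aic(candidates, n=len(abundance))
for name, delta in ranking:
    print(f"{name}: delta AIC = {delta:.2f}")
```

Only differences in AIC among models fitted to the same data are meaningful, which is why the ranking reports each model's distance from the best one. Repeating the whole ranking on bootstrap resamples of the data, rather than refitting only the chosen model, is one way to see how stable the selection is.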
Multimodeling: New Approaches for
Linking Ecological Models


Louis J. Gross and Donald L. DeAngelis 


The Everglades region of South Florida presents
one of the major natural system management
challenges facing the United States. With its assort- 
ment of alligators, crocodiles, manatees, panthers, 
large mixed flocks of wading birds, highly diverse sub- 
tropical flora, and sea of sawgrass, the ecosystem is 
unique in this country (Davis and Ogden 1994). The 
region is also perhaps the largest human-controlled 
system on the planet in that the major environmental 
factor influencing the region is water, and water flows 
are managed on a daily basis—subject to the vagaries 
of rainfall—by a massive system of locks, pumps, 
canals, and levees constructed over the past century. 
The changes brought about by such control have led 
to extensive modifications of historical patterns and 
magnitudes of flow, causing large declines in many na- 
tive species, extensive changes in nutrient cycling and 
vegetation across south Florida, and great increases in 
pollutants such as mercury. Constrained by the con- 
flicting demands of agriculture, urban human popula- 
tions, and wildlife for control of water resources, and 
the varying agendas of hosts of government agencies 
and nongovernmental organizations, there is now an 
ongoing effort to plan for major changes to the system 
with expenditure estimates of eight billion dollars or 
more over the next several decades (USACOE 1999). 
Carrying out such planning, particularly as it impacts 
the natural systems of the region, provides one of the
major challenges to the new field of computational
ecology. 


Computational Ecology and 
Regional Assessment 


Computational ecology is an emerging multidiscipli- 
nary field, similar in concept to the cell and molecular 
emphasis of bioinformatics, which applies modern 
computational methodology to key problems at higher 
levels of biological organization. The goal of compu- 
tational ecology is to combine realistic models of eco- 
logical systems with the often-large data sets available 
to aid in analyzing these systems, utilizing techniques 
of modern computational science to manage the data, 
visualize model behavior, and statistically examine the 
complex dynamics that arise (Helly et al. 1995). Suc- 
cess in applying this new tool to regions such as the 
Everglades requires expertise in diverse areas such as 
field biology, complex systems theory, computational 
science, remote sensing, and mathematical modeling, 
as well as the development of mechanisms to link ap- 
proaches that are at the forefront of research in many 
of these fields. 

The vast majority of theory in ecology has started 
from very simple differential equations in which a 
single variable represents population densities; solu- 
tions of these are analyzed mathematically and may be 
compared to abundance estimates from field or lab 
observations. Although these models have had great
influence on theory in ecology, their aggregated form 
is particularly difficult to relate to observational biol- 
ogy. Their application to complex natural systems 
with spatially and temporally varying environmental 
factors that force the system leads to models that are 
not analytically tractable and which must be investi- 
gated numerically. The study of equilibrium and sta- 
bility behavior typical in mathematical ecology is of 
little relevance for natural systems with strong abiotic 
forcing. 

One of the most important functions of the envi- 
ronmental sciences today is to analyze the impacts of 
human actions on ecosystems and to provide manage- 
ment recommendations in order to ameliorate these 
impacts. In all parts of the world, ecosystems are af- 
fected by the shrinkage and dissection of natural 
areas, disruptions of natural cycles, and the input of 
pollutants. Ecological assessment includes the determi- 
nation of the impacts of various anthropogenic influ- 
ences on a natural system. Common components of 
such an assessment would include the following: 


* Changes in population densities of species considered
“important” for either cultural or economic
reasons, including endangered species such as the 
West Indian manatee (Trichechus manatus) and the 
Florida subspecies of the cougar (Puma concolor 
coryi) (hereafter the Florida panther). 

* Changes in species composition of the system, particularly
introductions of non-native species and associated
issues of hybridization with species currently
present, an example being the rapid spread of
invading melaleuca trees in the Everglades. 

* Changes in community structure (which may not 
necessarily be associated with biodiversity changes), 
such as the shift from sawgrass- to cattail-domi- 
nated wetlands. 


Stressors requiring assessment include: 


* Effects of pollutant inputs, an example being the 
high mercury levels in fish throughout the Ever- 
glades. 

* Direct effects of human actions on the system, such 
as hunting, deforestation, and sewage/waste disposal,
an example being the effect of poaching on
the Florida panther. 

* Indirect effects of human actions, including habitat
fragmentation, soil erosion, and salinity changes, an 
example being the increase in salinity in Florida Bay 
due to reduced fresh-water flows through Ever- 
glades National Park. 


The spatial extent of the effects of anthropogenic
impacts ranges from local (tens of meters) to regional
(hundreds of kilometers) and therefore requires assessments
that can span these scales as well. Regional
problems involve controls and dynamics changing 
over time periods far longer than the one- to five-year 
periods typical of academic research projects. In addi- 
tion to Everglades restoration, projects involving ex- 
tensive computational components are ongoing in the 
Columbia River Basin, the Sacramento River Basin, 
and the Southern Appalachians, to name just a few of 
the efforts in the United States. Dealing with such re- 
gional problems has led to the use of remote sensing 
data and geographic information systems (GIS). Still 
very much in their infancy are methods to couple GIS 
with information on tracking of animal movements 
(now feasible for large numbers of organisms due to 
the availability of inexpensive and highly reliable 
radio tags), site-specific studies of animal behavior, 
and dynamic models. Yet, there are immediate needs 
for rigorous scientific methods to couple available 
data with realistic ecological models to assess the po- 
tential impacts of alternative management scenarios. 


Multimodeling and the Everglades 


It is exactly this need for assessment of the long-term 
(e.g., over thirty years) impacts of alternative hydro- 
logic scenarios on biotic components of the natural 
systems in South Florida that has led a large group of 
collaborators to develop an approach we call Across 
Trophic Level System Simulation, or ATLSS (DeAnge- 
lis et al. 1998). Just as the Atlas of mythology bore the 
world on his shoulders, we expect the methodologies 
we are developing to provide a firm basis for bearing 
the weight of ecological impact analysis for many nat- 
ural systems across the planet. Key to this is our use of 
a multimodeling methodology (Fig. 40.1) in which we 
Across Trophic Level Modeling
Figure 40.1. A general approach to across-trophic-level modeling
is illustrated with four different modeling approaches, each
with a range of spatial extents over which the variables utilized 
within the models act. Individual-based models are the most lo- 
calized, involving the largest numbers of variables, with associ- 
ated low levels of aggregation. Individual behavior depends in- 
herently upon local conditions, not just on an average at larger 
spatial extents. Index models typically produce a single index 
value per location and may be constructed at a wide variety of 
spatial extents depending upon the level of spatial resolution 
available for the input data and the model assumptions. Com- 
partment models and structured population/community mod- 
els are intermediate in the spatial extents appropriate for their 
variables. Structured population and community models take 
account of the details of size, age, and physiological state 
structure within a population or community by breaking down 
populations and communities into discrete classes. Within a lo- 
calized grid cell, such models can include 100-1,000 state 
variables. The hatching illustrates the temporal resolution ap- 
propriate for these models. In the ATLSS case, the index mod- 
els operate on yearly time steps while the individual-based 
models have time steps of less than a day. 


argue that the use of any single modeling approach is 
inappropriate for problems spanning a wide variety of 
temporal, spatial (grain and extent), and organismal 
scales. 

The application of mathematical models in most 
fields involves determining whether the key features of
the problem under consideration require a discrete or
continuous formulation, for example populations with
discrete life stages as compared to those with overlapping
generations. A second dichotomy concerns
whether stochastic factors should be included or ignored,
that is, whether unpredictable abiotic environmental
factors must be explicitly modeled or whether the averages
of such variations are sufficient to describe and
analyze the situation. Deciding these basic modeling
issues helps to specify the most appropriate single 
modeling approach for a particular situation, such as 
a matrix approach for structured populations in dis- 
crete time, an ordinary differential equation system 
for discretely structured populations in continuous 
time, or a partial differential equation for a continu- 
ously structured population in continuous time. It is 
quite atypical for the modeling framework used for a 
particular problem to utilize more than one modeling 
approach. Rather, as the modeling process proceeds, 
and additional complexities of the system under study 
are being considered, a new approach might be taken. 
Thus, a simple population model for abundance dy- 
namics could be phrased as a single differential equa- 
tion, but when age structure is added, the model for- 
mulation would then involve a system of differential 
equations. 
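
As a minimal sketch of the matrix approach for structured
populations in discrete time mentioned above, the following
projects an age-structured population with a Leslie-type matrix.
The three age classes and all rates are hypothetical values chosen
for illustration; they are not taken from any ATLSS model.

```python
def project(leslie, n, steps):
    """Project an age-structured population forward in discrete time.

    leslie: square matrix as a list of rows; row 0 holds fecundities
            and the sub-diagonal holds survival probabilities.
    n:      list of abundances, one entry per age class.
    """
    for _ in range(steps):
        # One time step is a matrix-vector product.
        n = [sum(a * x for a, x in zip(row, n)) for row in leslie]
    return n


# Hypothetical three-age-class population (illustrative values only).
LESLIE = [
    [0.0, 1.5, 2.0],  # offspring produced per individual in each class
    [0.5, 0.0, 0.0],  # survival from class 0 to class 1
    [0.0, 0.8, 0.0],  # survival from class 1 to class 2
]

if __name__ == "__main__":
    print(project(LESLIE, [100.0, 0.0, 0.0], 10))
```

Collapsing the three classes into one total abundance would give
the unstructured model; keeping them separate is what raises the
number of state variables.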

Unlike the situation described above, ATLSS relies 
upon a variety of different mathematical approaches, 
a process known as multimodeling. This mixture of 
approaches is based upon the inherent temporal and 
spatial resolution and extent of various trophic com- 
ponents linked together by spatially explicit informa- 
tion about underlying environmental (e.g., water, soil 
structure, etc.), biotic (e.g., vegetation), and anthro- 
pogenic (e.g., land-use) factors. The approaches cur- 
rently in use include spatially explicit indices, com- 
partment models, differential equations for structured 
populations and communities, and individual-based 
models. Linking models that operate at very different 
spatial and temporal extents has been a major chal- 
lenge, requiring a variety of spatial interpolation 
methods (Luh et al. 1997) and careful design of model 
interfaces (Duke-Sylvester and Gross 2001). Due to 
the modular nature of the project and the desire for 
long-term flexibility in what submodels may be included,
we chose from the start an object-oriented design,
utilizing C++ software.

Figure 40.1 illustrates the variety of dynamic approaches
utilized in ATLSS relative to the spatial and
temporal resolutions of the trophic levels involved. 
The hierarchical pattern in Figure 40.1 for the compo- 
nents of ATLSS arises from the association of different 
model types with different species or groups of species 
in the ecosystem that have different typical spatial ex- 
tents appropriate for the variables in the models. The 
spatial extent indicated ranges from the finest resolu- 
tion currently available for certain system compo- 
nents, 30 meters from Landsat image information, to 
regional extents on the orders of hundreds of kilome- 
ters. The horizontal edge of one of the boxes indicates 
the range of spatial extents over which variables for 
that type of model may reasonably be considered important.
Models that follow individuals necessarily require
localized information about those individuals in
order to appropriately assess the interactions between
individuals, which are inherently local. Although
variables in these individual-based models may be av- 
eraged over larger spatial extents, these averaged vari- 
ables cannot then be used to assess the basic interac- 
tions within the model. Other approaches have 
variables, such as those indicated as index models, for 
which the underlying environmental factors can be estimated
at a wide range of spatial extents without affecting
the basic assumptions of the model. Models
such as those for structured populations have variables
that may be defined at an intermediate range of
spatial extents, within which the spatial variability of
environmental conditions may reasonably be averaged.

The vertical axis of Figure 40.1 illustrates the com- 
plexity of the associated models, with the number of 
system variables used as a measure of such complex- 
ity. Alternatively, this complexity might indicate for 
the biotic system of interest the aggregation utilized in 
the description of its components. Some submodels, 
here indicated as compartment models and typically applied
to lower trophic levels, including algae and zooplankton,
can for practical purposes be well described
by simple coupled differential equations. These de- 
scribe interactions between functional groups, defined 
here as groups of species that perform similar roles 
(on the same trophic level), with the forcing functions 
in these equations representing the effects of abiotic
influences. Thus, a reasonable description of a zooplankton
community can be reduced to one or a few
state variables when the focus is on how zooplankton 
contribute to biomass fluxes at higher trophic levels. 
Other components of the biotic community may have 
a more complex organization that requires a struc- 
tured population or mixed-species community descrip- 
tion. Additionally, this structure may directly impact 
higher trophic components. For example, fish survival 
and growth are typically size dependent, implying that 
a single state variable for a fish population may be inappropriate
for many situations. The number of system
variables needed to describe a structured population
is higher than that needed when simply
following population abundances or densities, thus 
leading to a lower level of aggregation. 
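
A compartment model of the kind described above can be sketched
as two coupled differential equations with a forcing function.
The sketch below assumes one algal and one zooplankton functional
group, a sinusoidal seasonal forcing, and simple Euler
integration. All rate constants and the function name `simulate`
are hypothetical and are not drawn from the ATLSS compartment
models.

```python
import math


def simulate(days, dt=0.1, algae=1.0, zoo=0.1):
    """Euler integration of a two-compartment algae/zooplankton model.

    Returns a list with one (algae, zoo) pair per simulated day.
    """
    steps_per_day = round(1 / dt)
    history = []
    for day in range(days):
        for k in range(steps_per_day):
            t = day + k * dt
            # Forcing function: seasonal abiotic multiplier in [0.5, 1.5].
            forcing = 1.0 + 0.5 * math.sin(2 * math.pi * t / 365.0)
            growth = 0.5 * forcing * algae * (1 - algae / 10.0)  # production
            grazing = 0.4 * algae * zoo          # flux from algae to zooplankton
            d_algae = growth - grazing
            d_zoo = 0.3 * grazing - 0.1 * zoo    # assimilation minus losses
            algae = max(algae + dt * d_algae, 0.0)
            zoo = max(zoo + dt * d_zoo, 0.0)
        history.append((algae, zoo))
    return history
```

Each functional group contributes a single state variable, which
is the high level of aggregation the text attributes to this
model type.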

Higher trophic-level organisms are generally much 
more behaviorally complex and are able to display a 
repertoire of responses to environmental conditions 
and other organisms that cannot be captured in a 
small number of state variables. Each individual may 
behave differently based upon its current condition 
and current environment. For such organisms, an indi- 
vidual-based approach may be indispensable for deter- 
mining how the individual actions affect population- 
level phenomena. Adequate description of a single 
individual can easily require a number of variables de- 
tailing its current age, size, location, physiological sta- 
tus, and other information pertinent to its future ac- 
tions. When considering populations with hundreds or 
thousands of individuals, the total number of state 
variables can easily be in the tens of thousands. 

Although temporal resolution also has great impor- 
tance in this modeling scheme, it is not as straightfor- 
ward as spatial extent. In part, this is due to the fact 
that all populations are made up of individuals that 
carry out actions within short time frames. The rele- 
vant temporal resolution of importance here is that as- 
sociated with the system variables of interest. Thus, if 
one is dealing with zooplankton density rather than 
actions of an individual zooplankter, the appropriate
temporal resolution is not the seconds upon which in- 
dividuals move, but rather the many-day time frame 
over which significant changes in plankton density 
occur at the spatial extent chosen for the model. In 
Figure 40.1, the relevant temporal resolution should
be viewed as the longest period for which a temporal
description of the model makes sense. At the individ- 
ual level, the relevant temporal resolution is set by the 
unit of time on which an individual can execute ac- 
tions that affect its survival and reproduction. Thus, 
for individual birds, weekly time steps are inappropri- 
ate since within a week an individual bird’s actions 
can have great impact on its survival or growth. For 
fish populations, however, in which the relevant state 
variables are size or age distributions, an upper bound 
on temporal effects may be weeks or longer. 

Clearly, the temporal and spatial resolutions chosen 
for any particular trophic level depend greatly upon 
the questions being addressed. Any of the trophic 
components described above can certainly be investi- 
gated at much finer resolutions than those indicated in 
Figure 40.1. There are, for example, questions of great 
interest to ecophysiologists regarding algal responses 
to fine-resolution environmental fluctuations. How- 
ever, if the focus is on the linkages between trophic 
levels, and in particular the effects of lower trophic 
levels upon the dynamics at higher levels, then we 
argue that it is appropriate, as illustrated in Figure 
40.1, to consider different spatial and temporal reso- 
lutions for different trophic levels. 


The ATLSS Models 


The ATLSS hierarchy (see http://atlss.org) starts with 
models that translate coarse-resolution hydrologic in- 
formation to a finer resolution appropriate for biotic 
components that operate at spatial extents much 
smaller than the resolution of the main hydrologic 
model. The development of ATLSS has been motivated 
to a great extent by the efforts to analyze alternative
restoration plans for south Florida through the Central 
and Southern Florida Project Restudy (the Restudy, see 
http://www.evergladesplan.org). Throughout the devel- 
opment of ATLSS, there has been extensive consulta- 
tion between modelers and field biologists with experi- 
ence in the Everglades. Indeed, the initial set of models 
included within ATLSS were chosen following discus- 
sions with field biologists as to which species would be 
important in evaluation of any restoration plan. 

The main hydrologic model used for the Restudy, a 
product of the South Florida Water Management District,
has been utilized to produce all of the alternative plans
for water management and typically provides a thirty- 
year plan of daily water depths across approximately 
1,700 spatial locations, each of which represents a 
2x2-mile (3.2x3.2-kilometer) grid cell. A complete de- 
scription of the methods used to produce a high- 
resolution hydrology model is available on the ATLSS 
Project Home Page (ATLSS). This model relies upon 
vegetation maps and associated limitations on hy- 
droperiod for each vegetation type to characterize a 
28.5-meter-resolution topography that preserves the 
volumes of water derived from the 2-mile (3.2- 
kilometer)-resolution hydrology model. 
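
The volume-preserving idea can be illustrated with a small sketch.
It assumes, purely for illustration, that the water surface is
flat within one coarse cell and that the fine cells have equal
area; it then finds by bisection the water stage at which the mean
fine-cell depth equals the depth reported by the coarse model. This
is not the ATLSS procedure, which is documented on the project
home page, and the function name `downscale_depths` is
hypothetical.

```python
def downscale_depths(elevations, coarse_depth, tol=1e-9):
    """Return per-fine-cell water depths whose mean equals coarse_depth.

    elevations:   ground elevations (m) of equal-area fine cells
                  inside one coarse cell
    coarse_depth: mean water depth (m) from the coarse model
    """
    if coarse_depth <= 0:
        return [0.0 for _ in elevations]

    def mean_depth(stage):
        return sum(max(stage - z, 0.0) for z in elevations) / len(elevations)

    lo = min(elevations)                  # this stage holds no water
    hi = max(elevations) + coarse_depth   # this stage holds at least enough
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_depth(mid) < coarse_depth:
            lo = mid
        else:
            hi = mid
    stage = 0.5 * (lo + hi)
    # Cells above the water surface stay dry; lower cells receive more water.
    return [max(stage - z, 0.0) for z in elevations]
```

Because the mean depth over equal-area cells is conserved, the
total volume of water in the coarse cell is unchanged while low
ground is flooded more deeply than high ground.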

The ATLSS hierarchy next includes Spatially Ex- 
plicit Species Index (SESI) models, which make use of 
the spatially explicit, within-year dynamics of hydrol- 
ogy to compare the relative potential for breeding 
and/or foraging across the landscape (Curnutt et al. 
2000). These models produce yearly comparisons of 
the spatial extent of the index across the landscape. 
SESI models are viewed as approximations that are 
useful in coarse evaluations of scenarios and aid in in- 
terpreting the more-detailed models. SESI models have 
been constructed and applied during the Restudy to 
the Cape Sable seaside sparrow (Ammodramus mar- 
itimus mirabilis), the snail kite (Rostrhamus socia- 
bilis), short- and long-legged wading birds, and white- 
tailed deer (Odocoileus virginianus). 
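
To make the notion of a yearly, per-location index concrete, the
sketch below scores each cell by the fraction of days in a breeding
season on which water depth falls inside a suitable range. The
suitability rule, the thresholds, and the names `yearly_index` and
`index_map` are assumptions made for illustration; the published
SESI models use species-specific rules described in Curnutt et al.
(2000).

```python
def yearly_index(daily_depths, season, min_depth, max_depth):
    """Fraction of days in `season` with depth in [min_depth, max_depth].

    daily_depths: water depths (m), one per day of the year
    season:       (first_day, last_day), 0-based day of year, inclusive
    """
    first, last = season
    days = daily_depths[first:last + 1]
    suitable = sum(1 for d in days if min_depth <= d <= max_depth)
    return suitable / len(days)


def index_map(depths_by_cell, season, min_depth, max_depth):
    """Apply yearly_index to every cell; keys are hypothetical cell ids."""
    return {cell: yearly_index(series, season, min_depth, max_depth)
            for cell, series in depths_by_cell.items()}
```

Running `index_map` once per hydrologic scenario yields two maps
that can be differenced cell by cell, which is the kind of coarse
comparison the index models are meant to support.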

The compartment models involve biotic compo- 
nents for which it is reasonable to model variation 
across a landscape by means of many local uncoupled 
spatial unit cell models. The cell size chosen is small 
enough to represent a tract relatively homogeneous in 
substrate and elevation. A cell’s spatial extent might 
be several hundreds of meters in a relatively flat land- 
scape such as the Everglades. This is particularly ap- 
propriate for the primary producers (e.g., periphyton, 
aquatic macrophytes, terrestrial macrophytes), and 
meso- and macroinvertebrate consumers and detriti- 
vores, which can be combined into a few main func- 
tional groups. These models had not yet been devel- 
oped enough to adequately link them to hydrology, 
and so they were not applied to the Restudy. This low- 
trophic-level web is an important food resource for 
the higher trophic levels and for restoring higher-consumer
populations. The validation procedure for
fish models has shown that it is critical to account for
the temporal variability in these components (Gaff 
1999). 

The age- and size-structured population and com- 
munity models represent intermediate trophic levels, 
such as fish, macroinvertebrates, and small nonflying 
vertebrates. These populations may move short dis- 
tances in response to changes in water levels. Thus the 
model spatial domains of interaction are larger than 
for SESI models. A typical extent is up to one square 
kilometer, thus encompassing many unit cells coupled 
to allow population movements. Detailed models have 
been developed and applied in the Restudy to estimate 
responses of functional groups of fish across the fresh- 
water landscape (Gaff et al. 2000). These models con- 
sider the size distribution of large and small fish to be 
important to the basic food web that supports wading 
birds. They have been applied in order to assess the spatial
and temporal distribution of availability of fish prey 
for wading birds. 

Individual-based models are employed to represent 
populations of top predators or other large-bodied 
species. Individuals of these species may move over 
large areas, with movements over short periods of 
time spanning areas of thousands of spatial unit cells. 
Such models are also used for organisms of particular 
interest, such as endangered species, for which de- 
tailed behavioral information is available and there is 
particular interest in the detailed response of the
population to alternative plans. These individual- 
based models, an ecological form of agent-based mod- 
els (Duke-Sylvester and Gross 2001), are rule-based 
approaches that can track the growth, movement, and 
reproduction of many thousands of individuals across 
the landscape. Adequate description of a single indi- 
vidual can easily require a number of variables detail- 
ing its current age, size, location, physiological status, 
and other information pertinent to its future actions. 
ATLSS models of this type include the Cape Sable sea- 
side sparrow, snail kite, white-tailed deer, Florida pan- 
ther, and various species of wading birds. The models 
include great mechanistic detail, and their outputs 
may be compared to the wide variety of organism distribution
data available, including that from radio-collared
individuals. An advantage of these more-detailed
models is that they link each individual animal
to specific environmental conditions on the landscape.
These conditions (e.g., water depth, food availability)
can change dramatically through time and
from one location to another, and they can determine 
when and where particular species will be able to sur- 
vive and reproduce. 
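
A rule-based, individual-based model of this general kind can be
sketched in a few lines. The three rules below (move toward food,
gain or lose mass, die when reserves run out), the daily time step,
and every name and constant are hypothetical simplifications and
do not reproduce any ATLSS species model.

```python
import random
from dataclasses import dataclass


@dataclass
class Individual:
    age_days: int
    mass: float        # stands in for size and physiological status
    cell: tuple        # (row, col) location on the landscape grid
    alive: bool = True


def step(ind, food, rng):
    """Advance one individual by one day using simple, hypothetical rules."""
    if not ind.alive:
        return
    r, c = ind.cell
    # Rule 1: move to the best neighboring cell (or stay), judged by local
    # food; ties are broken at random.
    options = [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (r + dr, c + dc) in food]
    ind.cell = max(options, key=lambda rc: (food[rc], rng.random()))
    # Rule 2: gain mass from local food and pay a fixed maintenance cost.
    ind.mass += 0.1 * food[ind.cell] - 0.05
    ind.age_days += 1
    # Rule 3: die when reserves are exhausted.
    if ind.mass <= 0:
        ind.alive = False


def simulate(individuals, food, days, seed=0):
    """Run all individuals for `days` days; return the number still alive."""
    rng = random.Random(seed)
    for _ in range(days):
        for ind in individuals:
            step(ind, food, rng)
    return sum(ind.alive for ind in individuals)
```

Here `food` is a dictionary mapping grid cells to a local food
value, and each individual is assumed to start in a cell present
in that dictionary. Because every individual reads the conditions
in its own cell, the same population can fare very differently
under two hydrologic scenarios that share the same landscape
average.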


How ATLSS Was Applied 
during the Restudy 


The U.S. Army Corps of Engineers was charged in 
1996 by Congress to produce a restoration plan for 
the Everglades region of South Florida by 1 July 1999 
(see USACOE 1999 for details). The production of 
such a plan involved a massive effort on the part of 
numerous federal and state agencies. We cannot begin 
to describe here the full procedures for the develop- 
ment of the final plan presented to Congress, but we 
do want to describe the procedures used to apply 
ATLSS. The essential goal was to provide a rational, 
scientific basis for developing relative rankings of the 
biotic impacts of the proposed hydrologic scenarios as 
input to the planning process. 

A key feature of the ATLSS application is the re- 
liance on what we call relative assessment. By this, we 
mean that we do not claim that the variety of models 
within ATLSS can produce quantitatively accurate 
forecasts of population responses across the Ever- 
glades over the thirty-year plan. There are simply too 
many uncertainties in the models and the data used to 
estimate the parameters to attach any great confidence 
to the exact spatial variation in population estimates 
provided by the models. Rather, what we provided 
during the Restudy was a ranking of how various al- 
ternative plans caused the models to respond, relative 
to a base plan chosen by the agencies involved. 
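
Relative assessment can be expressed as a short calculation. The
sketch below assumes each plan has been reduced to a single summary
output per model (for example, a summed index) and ranks the
alternatives by their proportional change from the base plan. The
summary measure and the function name `rank_relative_to_base` are
illustrative assumptions, not the procedure used in the Restudy.

```python
def rank_relative_to_base(outputs, base):
    """Rank alternative plans by their change relative to a base plan.

    outputs: dict mapping plan name -> model output (a positive number)
    base:    name of the base plan in `outputs`
    Returns a list of (plan, relative_change) pairs, best first.
    """
    ref = outputs[base]
    changes = {plan: (value - ref) / ref
               for plan, value in outputs.items() if plan != base}
    return sorted(changes.items(), key=lambda item: item[1], reverse=True)
```

Only the ordering and the sign of the change relative to the base
are interpreted; the absolute values are not treated as forecasts.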

The procedure used to apply ATLSS in the Restudy 
can be described as follows: First, a hydrologic plan 
was developed. Next, the ATLSS team (comprising 
eight individuals) had one week to run the models,
evaluate them, and provide written comparisons of 
the results of each model to the base scenario and post 
the results and summaries on the ATLSS home page 
on the World Wide Web. There would follow several 
days of conversations with various individuals from 
agencies involved in assessing the scenarios (the
Alternative Evaluation Team) who would then make a
set of recommendations to the Alternative Design 
Team (ADT). It was then the responsibility of the 
ADT to revise the hydrologic plan, run a new hydrol- 
ogy model based upon the design changes, and pro- 
vide a new plan. The turnaround time between hydro- 
logic plans was typically three weeks, and the ATLSS 
team carried out over fifteen complete evaluations of 
plans during a fourteen-month period. 


The Future 


The focus of ATLSS to date has been on the freshwa- 
ter systems with an emphasis on the intermediate and 
upper trophic levels. The ATLSS structure was pur- 
posely formulated to provide for extension to estuar- 
ine and near-shore dynamic models once physical sys- 
tem models for these regions are completed. This 
would involve the construction of a variety of addi- 
tional models for the biotic components. A further ef- 
fort at lower trophic levels in the freshwater regions is 
needed to account for the impact of hydrologic plans 
on vegetation change and associated nutrient fluxes. 
Closely linked with these would be models for the ef- 
fects of major disturbances to the system, including 
fire and hurricanes. Finally, ecotoxicological models 
coupled to transport models for toxicants such as mer- 
cury may be readily incorporated into the biotic com- 
ponents already constructed. 

In the more general context of future developments, 
computational ecology will allow us to investigate 
ecological models far more realistic than before. Its 
practitioners are driven by the need to improve predic- 
tive capabilities—to more accurately assess the future 
impact of human actions on natural systems. The dif- 
ficulty of manipulating and replicating experiments at 
regional extents raises issues about experimental de- 
sign, the regularity and longevity of sampling, and the 
integration and storage of data. Associated with this is 
the need for methods to archive complex data struc- 
tures arising from spatially explicit dynamic models, 
which we have only begun to address by utilizing 
object-oriented databases. Computational ecology 
also has the capability to suggest appropriate long- 
term monitoring plans for natural systems to aid 
adaptive management. 


A central issue in computational ecology is the need 
to link dynamic processes that operate across differing 
spatial regions and at different rates. How do we link 
natural and anthropogenic forces that influence the 
demand for biological resources with the dynamics of 
those resources? How much averaging and smoothing 
of high-resolution biological data must be done to 
match the lower resolution of geophysical data while 
still preserving the predictive capabilities of the ap- 
proach for the underlying natural systems? Much de- 
pends on what we wish to predict and what level of 
accuracy is needed for such a prediction to be useful. 
As always for such complex models, issues of model 
validation and error propagation are difficult to ad- 
dress. We argue, based upon our experience with the 
Restudy, that in many cases what we wish to produce 
are relative assessments of different scenarios rather 
than exact quantitative predictions for any particular 
scenario. Given the assumption that errors in parame- 
ter estimates and functional forms within such com- 
plex models do not interact differentially with changes 
in scenarios (a reasonable assumption for scenarios 
produced by methods external to the models being 
used to assess them), it is likewise reasonable to as- 
sume that errors propagate similarly in model runs for 
different scenarios. This may justify the use of highly 
complex models for relative assessments, though we 
have had little success mathematically proving such 
assertions. 
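
The simplest version of this argument can be checked numerically.
The sketch below assumes that model error acts identically on every
scenario as a positive scale factor plus an offset, which is a far
stronger assumption than the one stated above and is made only for
illustration. Under it, the ranking of scenarios is unchanged even
though every individual value is wrong. The scenario names and
numbers are hypothetical.

```python
def ranking(outputs):
    """Plan names ordered from highest to lowest model output."""
    return sorted(outputs, key=outputs.get, reverse=True)


def with_shared_error(outputs, scale, offset):
    """Apply the same error (scale > 0, plus offset) to every scenario."""
    return {plan: scale * value + offset for plan, value in outputs.items()}


TRUE_RESPONSE = {"base": 100.0, "A": 120.0, "B": 90.0}  # hypothetical values
BIASED = with_shared_error(TRUE_RESPONSE, scale=0.6, offset=15.0)

if __name__ == "__main__":
    print(ranking(TRUE_RESPONSE), ranking(BIASED))
```

An error that differs between scenarios could reorder the plans,
which is why the assumption of no differential interaction carries
the weight of the argument.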

In addition to providing the ability to develop multi- 
modeling methods and the computationally efficient 
means of carrying them out, computational ecology of- 
fers the opportunity to combine spatially explicit eco- 
logical models with models that assess economic and 
social impacts. A related need is that of addressing 
problems of spatially explicit control (Hof and Bevers 
1998). Landscape-level management (e.g., forest har- 
vesting, water-flow management, conservation preserve 
design, etc.) is not an all-or-nothing affair that occurs 
uniformly in space. Rather, realistic management sce- 
narios must take into account spatial heterogeneity in 
underlying resources as well as how such heterogeneity 
interacts with management through time (local ecologi- 
cal succession for example). Given that there are many 
potential criteria affecting the system management, and 
that the underlying nonspatial issue might be viewed as
a multiple-criteria-optimization problem, how should
the “control” of the system be applied spatially in order 
to carry out the optimization? This is a little-developed 
area of applied mathematics, particularly in systems in 
which stochastic factors interact with the management 
scheme. Yet, such control problems are at the heart of 
much of applied ecology today.


Challenges of Modeling Fungal Habitat:
When and Where Do You Find Chanterelles? 


Tina A. Dreisbach, Jane E. Smith, and Randy Molina 


Fungal Habitat Modeling in the 
Pacific Northwest States 


In the past decade, land-use patterns have changed 
dramatically in the Pacific Northwest. As a result, land 
management practices are being scrutinized for im- 
pacts on valued species other than those that provide 
timber. In the early 1990s, the Northwest Forest Plan 
was developed to conserve biodiversity and species vi- 
ability by maintaining appropriate habitat. The record 
of decision (ROD) lists more than four hundred 
species of concern occurring in the range of the north- 
ern spotted owl (Strix occidentalis caurina). These or- 
ganisms include mammals, birds, amphibians, plants, 
lichens, mosses, and fungi. Like the northern spotted 
owl, the species of concern are considered old-growth 
dependent and are collectively referred to as "survey 
and manage" species. Of the four hundred species, the 
largest proportion (36 percent) are fungi (USDA/USDI 
1994a).

Within existing guidelines for the systematic evalu- 
ation of forest conditions, few methods exist for as- 
sessing species diversity and viability of fungi, yet 
fungi are important components of nutrient cycling, 
plant health, food webs, and ecosystem resiliency in 
forests (O’Dell et al. 1996). Furthermore, many are 
commercially valuable (Molina et al. 1993). Although 
fungal biodiversity has been studied in relation to particular
forest conditions (Bills et al. 1986; Villeneuve
et al. 1989; Luoma et al. 1991; Nantel and Neumann
1992; Keizer and Arnolds 1994; Zhou and Sharik 
1997; O’Dell et al. 1999; J. Smith unpublished data), 
little scientific information is available regarding the 
basic biology or habitat requirements for particular 
fungal species. This lack of information is a major im- 
pediment to the formulation of management recom- 
mendations and sampling procedures. 

Information needed for developing species conser- 
vation plans includes availability of habitat, response 
of the organism to habitat change, and population dy- 
namics. As a first step toward incorporating modeling 
into the study of forest fungi in the Pacific Northwest, 
we are developing habitat models. In the past several 
decades, habitat models have become decision-making 
tools for land managers, providing insight into ecosys- 
tem behavior over long periods and with hypothetical 
management scenarios. 

Modeling as a tool has had limited use in the field 
of mycology. Plant pathologists have recently used a 
variety of modeling techniques to investigate host- 
pathogen interactions (Thrall et al. 1997; Taylor et al. 
1998), pathogen fitness (Lannou and Mundt 1997; 
Newton et al. 1998), and the effects of environmental
factors on fungal pathogens (Gumpertz et al. 1997), and
to simulate consequences of disease (Frankel 1998).
The primary goal for modeling plant pathogenic fungi 
is to be able to predict disease and plan control 
strategies. In contrast, our interest in modeling is for
conservation of forest fungi. Mathematical models
have examined establishment and colonization of 
beneficial mycorrhizal fungi in roots, primarily for 
vesicular-arbuscular mycorrhizae in agricultural set- 
tings (Menge 1985; Tinker 1985; Walker and Smith 
1985). We believe that modeling also can serve useful 
purposes in the study of forest fungi, particularly in 
ecosystems where conservation management is an 
issue. 

Modeling can help mycologists predict the conse- 
quences of basic fungal processes such as dispersal, 
colonization, and reproduction. For forest fungi, we 
know virtually nothing about these processes. Model- 
ing can provide a formal organizing framework for 
generating ideas, conducting data analysis, and assist- 
ing in determining research directions by helping to 
generate hypotheses and define problems. Models can 
simulate and project over time and under a variety of 
disturbance and land-management scenarios to predict 
species occurrence and interactions. Models can help 
prioritize data collection efforts by predicting areas 
where centers of biodiversity exist or where more- 
intensive sampling may be necessary (Kiester et al. 
1996). In the Pacific Northwest, gap analysis is cur- 
rently being explored as a method for helping design 
regional survey strategies for forest fungi (T. O’Dell 
and R. Kiester personal communication). To that end, 
models also can provide a method of accountability 
and support to land-management decisions. Using 
models as ecosystem-based management tools will 
allow land managers to calculate risks of decisions 
and actions. This will be particularly helpful in the 
development of sustainable harvest strategies for 
commercially valuable fungi and in predicting the im- 
pact of human intervention on fungal survival and 
productivity. 

One of the primary needs of the USDA Forest Ser- 
vice and USDI Bureau of Land Management is devel- 
opment of habitat-based conservation plans. An im- 
portant question in fungal biology is, for a given 
habitat, what are the odds of finding a particular 
species or group of species? Underlying this question 
are three assumptions: (1) species are tightly linked bi- 
ologically to their habitats; (2) habitats can be de- 
tected, measured, and quantified; and (3) we are able 
to detect the species itself. Our work investigates habitat
requirements and modeling of several different
fungal species displaying various life strategies. Several 
are considered habitat specialists, possibly restricted 
to one or a few host species or a unique identifiable 
vegetation type. Others are habitat generalists, able to 
associate with a variety of host plants and vegetation 
types. Our long-range plan is to develop habitat mod- 
els encompassing multiple fungal species and life 
strategies, thereby fostering transition from single- 
species land management to multiple-species habitat 
management. 

We have begun by modeling several fungal species 
commonly referred to as chanterelles. The species in
the chanterelle group belong to three genera: Cantharellus,
Gomphus, and Polyozellus. We chose this group
for the following reasons: (1) All species are included 
in the ROD survey-and-manage listing (USDA/USDI 
1994a), but habitat information is minimal or lacking. 
(2) All may be considered “charismatic macrofungi”; 
they are easily recognized and identified, thereby in- 
creasing the possibilities of detection by field crews 
and others working in the forest. (3) These fungi cover 
the entire range from rare to weedy in occurrence, im- 
plying that habitat needs may be less stringent for 
some than for others. (4) The different life strategies 
displayed by our selected species are representative of 
other survey-and-manage fungi such that we can 
transfer results to developing models for other fungi. 
(5) Several species within this group are of interest as 
highly sought-after edibles and commercially valuable 
nontimber forest products. The objectives of our re- 
search are as follows: 


* Identify ecological factors that determine fungal 
habitat. 

* Develop spatial and statistical models to predict the 
occurrence of chanterelle species. In particular, the 
models are intended to predict the impact of envi- 
ronmental change at multiple geographic scales. 

* Apply these models across a broad geographic area 
in the Pacific Northwest. This will determine the 
ability to evaluate and predict desired forest condi- 
tions for maintaining viability of cantharelloid 
species while meeting ecosystem management objec- 
tives. 

* Assess how we might apply the models to other
survey-and-manage fungi. Ultimately, this is necessary
for multiple species-environment and species-habitat
understanding.


Modeling Approach 


Patton (1997) defines habitat as the environment and 
specific location where an organism lives, including 
the combination of factors where an organism can 
survive and reproduce. Guidelines for development of 
habitat models are given by the U.S. Fish and Wildlife 
Service (USFWS 1981b) and have been used to gener- 
ate models for mammals, fish, and birds. In this ap- 
proach, habitat variables are selected according to the 
following criteria: (1) Is the variable important to sur- 
vival and reproduction of the species? (2) Is there a 
basic understanding of the habitat-species relation for 
the variable? (3) Are data available and is the variable 
practical to measure? Unfortunately, few studies of 
fungi have directly assessed any of these criteria. 

Our most comprehensive database for survey-and- 
manage fungi consists of approximately five thousand 
records and includes information gleaned from histor- 
ical material housed in herbaria as well as collections 
made by the USDA Forest Service Fungal Survey and 
Manage Team and cooperators (T. O'Dell personal 
communication). Although at first glance there appears
to be an abundance of data, we in fact lack the
crucial information needed to develop models predict- 
ing fungal occurrences based on habitat. Because these 
data were collected by many different persons, statisti- 
cal analysis is problematic if not impossible. The main 
difficulty arises from the manner in which data were 
collected. Collection labels may contain little to no in- 
formation on habitat. For example, a subsample of 
386 chanterelle records revealed that less than 50 per- 
cent include information on the type of substrate on 
which a mushroom was found growing. Even when 
information is included it is often vague, for example, 
habitat listed only as dense woods or mixed conifers. 
Consequently, our choice of modeling methods is
limited. We do not know enough about life processes
of forest fungi to develop quantitative simulation 
models. We also have no idea of how population dy- 
namics operate for most fungi, precluding population 
viability analysis. 


We therefore must make assumptions about our 
study organisms by basing habitat parameter estima- 
tions on published studies of related organisms. In 
some cases, anecdotal information or expert opinion 
also is used. Scientific researchers, commercial mush- 
room harvesters, and recreational forest users have a 
great deal of experiential knowledge as to when and 
where particular mushroom species may be found. 
These experts often can describe optimal habitat in ex- 
tensive detail. Accordingly, our approach is primarily 
qualitative: we use knowledge- and rules-based meth- 
ods and expert systems for predicting expected distri- 
butions. These modeling methods are useful in that 
both quantitative and qualitative information can be 
synthesized (Starfield and Bleloch 1991). In addition, 
the availability of digital maps and geographic infor- 
mation systems (GIS) provides opportunities for link- 
ing our modeling rules with spatial databases and sim- 
ulation models for other biological systems. In the 
following review, we discuss the development of rules 
for predicting the occurrence of suitable habitat for 
forest fungi in general and for chanterelles of the Pa- 
cific Northwest in particular. 


Species of Interest 


The chanterelle species all produce mushrooms with 
distinctive funnel shapes and spore-bearing folds. 
Table 41.1 lists the species chosen for the current 
study and their abundance in Pacific Northwest 
forests. 


TABLE 41.1. 


Species of interest: Cantharelloid fungi from the ROD (record 
of decision) list. 


Species                  Common name          Abundance (a)
Cantharellus formosus    yellow chanterelle   weedy
C. subalbidus            white chanterelle    common
C. tubaeformis           winter chanterelle   common
Gomphus bonari           scaly chanterelle    rare
G. floccosus             scaly chanterelle    weedy
G. kauffmanii            scaly chanterelle    common
G. clavatus              pig's ears           common
Polyozellus multiplex    blue chanterelle     rare


(a) Abundance is based on the number of records in existing
survey-and-manage and herbarium databases: weedy = more than
100 records; common = 50-100 records; rare = fewer than 50
records.
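
The abundance classes in the table footnote amount to a simple threshold rule on record counts. A minimal sketch follows; the function name `abundance_class` is our own illustrative choice and is not part of the original study.

```python
def abundance_class(n_records: int) -> str:
    """Classify a species by its number of database records,
    using the thresholds given in the Table 41.1 footnote."""
    if n_records < 0:
        raise ValueError("record count cannot be negative")
    if n_records > 100:
        return "weedy"    # more than 100 records
    if n_records >= 50:
        return "common"   # 50-100 records
    return "rare"         # fewer than 50 records
```
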
Geographic Scope
Our efforts are concentrated in areas defined as poten- 
tial northern spotted owl habitat, including U.S. For- 
est Service lands in northern California, western Ore- 
gon, and western Washington. Ideally, the extent of 
our models would include this entire area, for which 
forest inventory databases are collected and main- 
tained by state and federal agencies. In addition, GIS 
coverages of vegetation, soils, and climate are avail- 
able for much of this region. In the 1990s, specific 
areas, such as the Coast Range of Oregon, the H. J. 
Andrews Experimental Forest (on the western slope of 
the Cascade Range), and the Willamette National For- 
est, have been the sites of many ecological and myco- 
logical studies. Our first models will focus primarily 
on forest types within these geographic areas. 
Difficulty arises from attempts to use existing data 
for construction of models with such a large extent. 
Many studies of fungi are conducted at a level of reso- 
lution of less than 1 meter and measure microhabitat 
factors such as the number of fruiting bodies occur- 
ring in a cluster or proximity to the nearest piece of 
down wood. In terms of grain, these studies may not 
be adequate to evaluate factors at the appropriate 
level of resolution for building large-area models. In a 
given forest type, habitat consists of a series of many 
discontinuous patches, and important biological 
processes to consider are host specialization, migration,
and extinction. At the local scale, one habitat area may
equal one or a few patches, and processes such as
competition, aggregation, and the match between species
niche and habitat become important.


Temporal Scope 

Fungi live as aggregations (mycelia) of microscopic 
strands (hyphae). Mycelia inhabit soils, wood, litter, 
and living plants. Mycelial colonies can range in size 
from microscopic to many hectares and can persist for 
years (Smith et al. 1992; De la Bastide et al. 1994; 
Dahlberg and Stenlid 1995). Fungal reproductive 
structures are the parts readily seen in the forest in the 
form of cups, truffles, conks, and mushrooms. Timing 
of mushroom formation (and hence organism detec- 
tion) is species specific and generally occurs only when 
nutritional requirements and environmental conditions
(temperature, light, pH, moisture) are appropriate
during particular seasons of the year (Hunt and
Trappe 1987; Luoma 1991). Therefore, it is necessary 
to define the season of the year for which the model is 
applicable for each species. Year-to-year variability in 
mushroom production also complicates modeling. 
Over a three-year period, Luoma (1991) documented 
a similar number of truffle species occurring each year; 
however, the proportions of those species differed 
greatly from year to year. 

Our survey-and-manage records indicate that 
mushrooms of all chanterelle species occur most often 
in fall. At northern locations or higher elevations, or 
both, fruiting begins as early as late July or early Au- 
gust. In the southern part of our modeling area, and at 
lower elevations, fruiting extends from October 
through January. One species, Cantharellus tubae- 
formis (winter chanterelle), is documented from many 
locations in nearly all months. Anecdotally, mush- 
rooms fruited prolifically in fall 1997 and poorly in 
fall 1998. The exact causes for this variability are un- 
known, but it is clear that this unpredictability re- 
quires incorporation of additional variables or sto- 
chasticity into the models. The purpose is not to define 
the suitability of habitat per se, but to give us an idea 
of the probability of detection. 


Determination of Important 
Ecological Factors 


Ecological factors often associated with fungi fall into 
three broad categories: vegetation, topography and 
soils, and climate. 

Vegetation may be the primary factor contributing
to fungal habitat, largely because of the host
specificity that many fungi exhibit. Our study includes
several ectomycorrhizal species as well as
species that serve a function as decomposers of dead 
material, thereby cycling nutrients. Ectomycorrhizal 
fungi (EMF) form mutually beneficial relationships 
with host plants by directly supplying nutrients in ex- 
change for carbon. Molina and Trappe (1982) recog- 
nize three groups of EMF based on the relationships 
to host species, ranging from highly specific to nonspe- 
cific. Decomposer fungi also show various levels of 
host-substrate specificity (Swift 1982). We therefore 
can use the presence or absence of particular host tree
species as an initial indicator for potential occurrence
of many species of fungi.
Herbarium and collection records provide some information
regarding the presence of dominant tree
species in stands where chanterelles occur. Cantharel- 
lus formosus (Pacific golden chanterelle) is docu- 
mented from forests dominated by many coniferous 
hosts, including Abies, Picea, Pinus, Pseudotsuga, 
Thuja, and Tsuga; C. subalbidus (white chanterelle) 
appears somewhat more limited in that no collection 
records thus far have indicated association with Abies 
or Picea. Recent research suggests that collections 
identified as yellow chanterelles may in fact represent 
at least two distinct species, with possible host and 
habitat differences (Feibelman et al. 1994; Redhead et 
al. 1997; S. Dunham personal communication). 
Forest stand age and structure may play a critical 
role in species occurrence (Crites and Dale 1998). Dis- 
turbances such as harvesting or fire cause changes in 
fungal diversity and productivity (Amaranthus et al. 
1994; Clarkson and Mills 1994; Stendell et al. 1999). 
Studies of EMF communities suggest that species suc- 
cession occurs following wildfire (Visser 1995; Jons- 
son et al. 1999; Stendell et al. 1999). Tree harvest also 
affects the community of forest fungi (Pilz and Perry 
1984; Colgan et al. 1999). The ROD assumes that 
listed fungi are old-growth dependent; however, 
chanterelles also may occur in younger stands. Al- 
though Danielson (1984) indicates that Cantharellus 
could not be found in a six-year-old jack pine stand, 
regenerating after wildfire, Pilz et al. (1998) found 
Cantharellus in coastal hemlock stands as young as 
twenty years in Washington. Research by S. Dunham 
(personal communication) indicates that one species 
of yellow chanterelle appears in both old-growth and 
rotation-age stands of forty-five to sixty years. An- 
other species of yellow chanterelle and the white 
chanterelle are more likely to be found in old-growth 
forests. Data for effects of fire and harvest on other 
chanterelle species are not currently available. For 
modeling purposes, however, we will assume that the 
presence of fire or harvest within approximately 
twenty years precludes the occurrence of chanterelles. 
Soil organic matter, including humus and coarse 
woody debris (CWD), especially advanced stages of
decay in the form of brown cubical rot, is an important
substrate for EMF (Harvey et al. 1976) and decom- 
poser fungi. The CWD may be particularly important 
as a moisture-retaining substrate, allowing root tips to 
support active ectomycorrhizae in times of seasonal 
dryness. This is important on dry sites or following fire 
(Harvey et al. 1978; Amaranthus et al. 1989; Harmon 
and Sexton 1995). These fallen tree “reservoirs” may 
provide refugia for seedlings and mycorrhizal fungi, 
particularly in more arid forests. For dry forests in 
western Montana, Harvey et al. (1981) estimated that 
about 25-37 tons per hectare of CWD are needed to 
support ectomycorrhizal activity needed for a develop- 
ing ecosystem. Currently, no data exist for quantities of 
CWD necessary to support viability of fungi in most 
forest ecosystems of the Pacific Northwest. After dis- 
turbance, colonization of CWD by ectomycorrhizal 
fungi may be limited in the early stages of stand devel- 
opment. As stands mature, the availability of CWD 
nevertheless seems to be crucial for the establishment 
of fungi as well as seedlings (Kropp 1982; Luoma et al. 
1996). The CWD may be a good predictor of 
chanterelles, and we therefore assume that these fungi 
require a minimal volume of CWD for habitat mainte- 
nance. Our first parameter estimates will specify no 
less than 10 percent cover by CWD (Amaranthus et al. 
1994). Detailed field studies evaluating the relationship
between CWD and chanterelle occurrence and production
are now under way, and these data will be incorporated
as they become available.
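
The two preliminary assumptions stated above (no fire or harvest within approximately twenty years, and at least 10 percent cover by CWD) can be read as a simple screening rule. The sketch below is only an illustration: the function name, argument names, and the treatment of exactly twenty years as acceptable are our assumptions, not part of the published parameter set.

```python
from typing import Optional

MIN_CWD_COVER_PERCENT = 10.0   # first parameter estimate for CWD cover
MIN_YEARS_UNDISTURBED = 20.0   # approximate exclusion window after fire or harvest


def passes_preliminary_screen(cwd_cover_percent: float,
                              years_since_disturbance: Optional[float]) -> bool:
    """Return True if a stand meets both preliminary assumptions.

    years_since_disturbance is None when no fire or harvest is on record.
    """
    recently_disturbed = (
        years_since_disturbance is not None
        and years_since_disturbance < MIN_YEARS_UNDISTURBED  # boundary is an assumption
    )
    enough_cwd = cwd_cover_percent >= MIN_CWD_COVER_PERCENT
    return enough_cwd and not recently_disturbed
```
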

Topographic and soil factors that may be important 
to chanterelle occurrence include elevation, slope, as- 
pect, soil properties, and local microtopography. The 
survey-and-manage database indicates approximate el- 
evation ranges for each of the study fungi, and these 
are included as preliminary parameters. Factors such as 
slope, aspect, landform type, topographic moisture 
index, and soil texture are only now being investigated. 

Climate seems to be a complex factor in fungal dis- 
tribution. Seasonal and ecological distribution of fungi 
is partly determined by temperature and moisture. 
Wilkins and Harris (1946) contend that moisture may 
be the most important single environmental factor 
controlling fungal reproduction. Climatological data- 
bases are readily available from individual states. We 
know little, however, about the effects of climate on 
long-term survival of fungi or on timing of mushroom
formation. From the survey-and-manage records we 
do know that all our study species form mushrooms in 
fall. Norvell (1995) found that amount of precipita- 
tion does not affect chanterelle productivity in moist 
forests of western Oregon. Chanterelle productivity 
was positively correlated with average mean summer 
temperature. All other factors being equal, years in 
which the average mean summer temperature was 
high showed greater productivity. Neither precipitation
nor temperature has been directly linked to occurrence.
In theory, however, increased productivity
should increase our likelihood of detecting chanterelle 
occurrence. 


The Models 


Two types of models are being developed to predict 
occurrence of chanterelle species in the lowland mixed 
conifer forest type extending across the Oregon Coast 
Range and the west side of the Cascade Mountains in 
central Oregon. The first is based on ecological factors 
that contribute to fungal habitat over large areas (e.g., 
ecoregions), primarily plant association, dominant 
tree species present, forest structure, temperature, and 
precipitation. The habitat evaluation procedures 
(HEP) originally developed by the U.S. Fish and 
Wildlife Service (USFWS 1981a) are being modified to
develop habitat suitability index (HSI) scores. These 
will correspond to the probability of occurrence of a 
species and will range in value from 0 (low probability 
of occurrence - poor habitat) to 1.00 (high probabil- 
ity of occurrence - optimal habitat). A rules-based ap- 
proach, linked to a GIS in the form of queries, will 
provide predictions given current conditions of habitat 
variables as shown in composite GIS coverages for se- 
lected parameters. Figure 41.1 presents an example of 
rules we are developing for Polyozellus multiplex, the 
blue chanterelle. We will generate maps from these 
predictions to give a first approximation of where a 
particular species occurs and how likely we are to de- 
tect that species. Presumably, weedy species will gener- 
ate more and larger areas of higher HSI value than 
those that are more specific in habitat requirements. 
The second type of model is being developed to examine
microhabitat variables operating within a particular
forest stand. Microhabitat factors, such as the amount and
quality of coarse woody debris present, are used to
develop rules defining the probability of occurrence for a
given species within a limited geographic area. For this
type of modeling, we are using an expert-system approach
in order to utilize both quantitative and qualitative
data.


IF      Longitude is < 123 degrees
AND     Elevation is > 1,000 meters, < 1,500 meters
AND     Abies is present
OR      Pinus is present
AND     Stand age is > 100 years
THEN    The probability of Polyozellus multiplex
        occurrence is 25 percent


Figure 41.1. Example of rules-based modeling technique for
Polyozellus multiplex, the blue chanterelle.
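
The example rule in Figure 41.1 can be sketched as a small function. This is an illustration under stated assumptions, not the query used in the GIS: we assume the two host conditions are grouped as (Abies OR Pinus), that longitude is expressed in degrees west, and that the rule returns nothing when its conditions are not met, because the figure does not say what applies otherwise. All names are hypothetical.

```python
from typing import Optional


def polyozellus_rule(longitude_deg_w: float,
                     elevation_m: float,
                     abies_present: bool,
                     pinus_present: bool,
                     stand_age_years: float) -> Optional[float]:
    """Return 0.25 when the Figure 41.1 rule fires, otherwise None."""
    # Assumed grouping: either host genus satisfies the host condition.
    host_present = abies_present or pinus_present
    if (longitude_deg_w < 123
            and 1000 < elevation_m < 1500
            and host_present
            and stand_age_years > 100):
        return 0.25
    return None  # the figure gives no probability for other cases
```

A full rule base would hold one such rule per combination of habitat conditions and would be evaluated against each map unit in the composite coverage.
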


Validation and Accuracy Assessment 


The Pacific Northwest has an abundance of experts on 
fungi and on chanterelles in particular. Among Ore- 
gon State University, USDA Forest Service Pacific 
Northwest Research Station, and other nearby agen- 
cies and institutions, about thirty mycologists will 
voice an opinion on model accuracy. Members of the 
North American Truffling Society, the Oregon Myco- 
logical Society, and commercial mushroom harvesters 
will contribute knowledge gained from many years of 
data collection and observations that will be invalu- 
able in model development, although these individuals 
are often secretive about their picking territories. 

Ground surveys will also be employed to determine 
accuracy of predictions. By visiting areas of both high 
and low probability of occurrence, we will quantify er- 
rors of omission (detection where not predicted) and 
errors of commission (failure to detect where predicted 
to occur). Concurrently, environmental factors will be 
measured in survey areas. These additional data will be 
incorporated into the habitat models to refine parame- 
ters and increase model accuracy. We need to keep in 
mind, however, that above-ground indicators (presence 
of mushrooms) may poorly reflect the composition of 
below-ground fungal communities (Gardes and Bruns 
1996). Therefore, ground surveys will be conducted 
several times a year over several years. 
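
The error tallies described above can be sketched as follows, using the definitions given in the text (omission: detection where not predicted; commission: failure to detect where predicted). Expressing each error as a fraction of all surveyed sites is our assumption, as are the function and key names.

```python
def survey_error_rates(sites):
    """Tally omission and commission errors from ground surveys.

    sites: iterable of (predicted_present, detected) boolean pairs,
           one pair per surveyed site.
    """
    sites = list(sites)
    if not sites:
        raise ValueError("no surveyed sites")
    # Omission: species detected where the model did not predict it.
    omission = sum(1 for predicted, detected in sites if detected and not predicted)
    # Commission: species not detected where the model predicted it.
    commission = sum(1 for predicted, detected in sites if predicted and not detected)
    n = len(sites)
    return {
        "omission": omission,
        "commission": commission,
        "omission_rate": omission / n,      # fraction of all sites (assumption)
        "commission_rate": commission / n,
    }
```

Because mushrooms may fail to fruit in a given visit, a commission error from a single survey is weak evidence of true absence, which is why repeated surveys matter.
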
Habitat Modeling—Opportunity
and Challenge 


By modeling fungal habitat, we have the opportunity 
to learn a great deal about the basic biology of these 
organisms. In our preliminary models, we are making 
major assumptions about selected fungi and their role 
in the forest ecosystem. We assume that as het- 
erotrophic organisms, fungi are directly or indirectly 
dependent on plants and plant communities. We know 
mycorrhizal fungi form obligate associations with and 
obtain nutrients from one to several living host species 
and that decomposer fungi obtain nutrition from wood 
(living and dead), leaves and twigs in the litter layer, or 
freshly burned material. Some species also may require 
distinct ecological niches, such as forest gaps or undis- 
turbed areas. Distribution of nutrient sources (plants, 
woody debris) partly determines fungal distribution. 
Fungal species have different tolerances to changing 
environment and plant communities. As plant commu- 
nities change over ecological time under the influence 
of soil, climate, topography, and organisms, and in 
present time as a result of natural catastrophes or 
human activities, fungal species composition is altered. 
Fungal species composition also influences plant com- 
munity structure, providing a complex feedback mech- 
anism (van der Heijden et al. 1998). 

Our models are conceptual and qualitative in na- 
ture, due to the current scarcity of statistically analyz- 
able data. Figure 41.2 provides a flow chart of our 
generalized modeling scheme. Rules-based and map- 
ping models are only the first step in understanding 
habitat and predicting fungal species occurrences. Ex- 
ploratory data analyses are useful in determining 
trends. Bayesian and multivariate statistical techniques
(such as principal components analysis, correspondence
analysis, classification and regression trees, and multiple
regression) provide approaches for analyzing large data
sets and evaluating the contribution of many variables to
habitat (Grubb 1988; Morrison et al. 1999).


[Flow chart. Boxes: Historical Database; Published Literature;
Map of Known Locations; Habitat Parameters; Large-scale
Models: Maps of Predicted Locations (extent: forest type);
Small-scale Models: Expert Systems (extent: stand).]


Figure 41.2. Generalized scheme for modeling fungal habitat.

As knowledge gaps are identified, the challenge to 
mycologists will be to address these gaps with studies 
designed to integrate with each other and with model- 
ing efforts. Considerations in designing future studies 
include determining how temporal variability can be 
assessed and incorporated into models, and collecting 
data appropriate to the regional scale (grain and 
extent). 
Predicting Presence/Absence of
Plant Species for Range Mapping: 
A Case Study from Wyoming 


Walter Fertig and William A. Reiners 


The range of a plant species is shaped by its phylogeny,
ecological adaptations, interactions with
other species, and historical events (Daubenmire 
1978). Detailed knowledge of a plant’s range is in- 
creasingly important for land managers and conserva- 
tion biologists. Distribution maps are one of the best 
tools for depicting information about a species’ range 
and its environmental requirements. 

Traditional range maps depict species’ distributions 
as irregular polygons (outline maps), assemblages of 
points (dot maps), or a combination of both (Brown 
and Lomolino 1998). Outline maps are typically 
drawn by hand based on expert knowledge. The as- 
sumptions of the mapper are rarely stated (and thus 
are difficult to test), and such maps usually overesti- 
mate the actual range of a species, particularly for 
large areas. Dot maps show documented species loca- 
tions but may represent only a fraction of the species’ 
actual range or may reflect sampling bias (Brown and 
Lomolino 1998). Combination outline and dot maps 
may be an improvement but are still likely to overesti- 
mate distribution over large geographic areas. 

The development of geographic information sys- 
tems (GIS) has revolutionized the art and science of 
range mapping. GIS, in conjunction with large envi- 
ronmental data sets and computerized geostatistical 
methods, allows models of the interactions between a 
plant and its environment to be incorporated into the
mapping process (Franklin 1995). Model-based range
maps are superior to traditional ones because the as- 
sumptions of the map are explicit and readily testable 
with new data. 

Plant distributions can be modeled mechanistically 
or empirically. Mechanistic models employ ecophysio- 
logical information about the modeled species and de- 
tailed fine-scale environmental measurements to pre- 
dict a plant’s potential range in a localized area. Such 
an approach is often difficult for most vascular plant 
species over geographic areas exceeding hundreds to 
thousands of kilometers because the requisite ecophys- 
iological studies have not been conducted and fine- 
grained environmental data at these scales are often 
unavailable. Models based strictly on ecophysiology 
may also exclude the effects of competition and other 
biotic interactions, thereby generating a model of a 
species’ potential niche rather than its realized niche. 

Empirical (or correlational) models are based on
correlations between a species’ occurrence and selected
environmental variables that either directly influence the
species or are surrogates for direct gradients (Austin et
al. 1990; Franklin 1995).
The resulting environmental envelope can then be 
used to generate a potential range map that approxi- 
mates the species’ realized niche. An important limita- 
tion of empirical models is that causal factors are not 
determined, and so these models will be less successful 
than mechanistic models at predicting range shifts due 
to changes in climatic or biotic variables. Empirical 
modeling may also be inadequate for rare species
whose distributions do not reflect the full extent of 
their realized niche because of incomplete dispersal, 
recent origin, localized extinction, or historical acci- 
dents. Despite these limitations, empirical modeling 
can be an efficient technique for predicting ranges of 
relatively common species over scales of hundreds to 
thousands of square kilometers. 

To date, most plant modeling studies have been ap- 
plied to localized areas ranging in size from 1-square- 
meter plots (Wiser et al. 1998) to regions up to 70,000 
square kilometers (Franklin 1998). These studies have 
focused on individual species (Davis and Goetz 1990; 
Guisan et al. 1998), guilds of species (Franklin 1998; 
Wiser et al. 1998), or plant communities (Lees and 
Ritman 1991; Moore et al. 1991; and reviewed by 
Franklin 1995). Those modelers studying larger spa- 
tial extents (regional or subcontinental scales exceed- 
ing several thousand square kilometers) have prima- 
rily addressed potential effects of climate change on 
distributions of ecologically important species or com- 
munities (Brzeziecki et al. 1995; Huntley et al. 1995; 
Iverson and Prasad 1998; Iverson et al. 1999). 

We describe a prototype empirical modeling proce- 
dure for individual plant species centered in the state 
of Wyoming. Unlike many of the preceding studies, 
our approach covers a larger geographic area (approx- 
imately 252,000 square kilometers) and utilizes state 
and regionwide digital environmental and herbarium- 
based plant location data rather than localized, plot- 
derived environmental and presence/absence data sets. 
Our state-scale predictive models will ultimately be 
used for a gap analysis of selected elements of the vas- 
cular flora of Wyoming (Fertig et al. 1998) but could 
also be used as a baseline for studying future range 
shifts due to climate change or to identify new areas 
to survey for plants of management or conservation 
interest. 


Methods 


The vascular flora of Wyoming contains more than 
2,700 native and introduced taxa, making the model- 
ing of all species for range mapping impractical. We 
have randomly selected two hundred plant taxa that 
represent a cross-section of growth form, abundance,
biome affinity, and geographic distribution patterns in
the flora and that were represented by at least twenty- 
five known locations in Wyoming (Fertig et al. 1998). 
We have chosen one of these target species for our 
pilot study to demonstrate the suitability of different 
statistical modeling approaches that will then be ap- 
plied to the remaining study pool. 

Mentzelia pumila (Nutt.) T. & G. var. pumila 
(Loasaceae), a short-lived perennial forb restricted to 
semibarren desert plains and slopes, occurs from 
south-central Montana and western North Dakota to 
central Wyoming, northeastern Utah, and northwest- 
ern Colorado. Based on collections at the Rocky 
Mountain Herbarium (RM) and reports from Hill 
(1975), M. pumila is known from approximately fifty 
locations in Wyoming (Fig. 42.1). Only thirty-five of 
these locations were chosen for modeling because they 
had sufficiently precise label data to place them within 
0.5-2.5 kilometers of their presumed collection site. 

Absence data have not been routinely collected for 
M. pumila, or virtually any other plant species, on a 
statewide basis. As a surrogate for confirmed absence 
points, we used the RM's database of over nine thou- 
sand sampling localities to identify areas where this 
species has not been collected (and is thus presumed 
absent). Since 1977, RM researchers have systemati- 
cally established these sampling sites across the entire 
state of Wyoming for the purpose of collecting all 
plant taxa present in the immediate area (Hartman 
1992). We stratified these putative absence sites by 
their environmental attributes and randomly selected 
1,270 points that would reflect the full range of varia- 
tion in each of the environmental variables selected for 
model development. By using such a large data pool, 
we were able to produce an approximately uniform 
distribution of absence location points across the 
state, although denser concentrations of points are 
present in mountainous areas with steep environmen- 
tal gradients (Fig. 42.2). Any presumed absence sites 
that overlapped with known presence locations were 
removed from the final dataset. 
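
The selection of putative absence sites can be sketched as a stratified random draw. The sketch below is a simplified stand-in rather than the procedure actually used: it assumes each sampling locality already carries a single stratum label, draws from the strata in rotation so that every stratum is represented, and drops sites that coincide with known presences before sampling rather than afterward. The function and argument names are hypothetical.

```python
import random
from collections import defaultdict


def select_absence_sites(sites, presence_ids, n_total, seed=0):
    """Draw up to n_total absence sites, rotating through the strata.

    sites:        list of (site_id, stratum_label) pairs
    presence_ids: set of site_ids with known presence records
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for site_id, stratum in sites:
        if site_id not in presence_ids:      # exclude overlap with known presences
            strata[stratum].append(site_id)
    for members in strata.values():
        rng.shuffle(members)                 # random order within each stratum

    selected = []
    # Take one site per stratum in turn until the quota is met
    # or every stratum is exhausted.
    while len(selected) < n_total and any(strata.values()):
        for stratum in sorted(strata):
            if strata[stratum] and len(selected) < n_total:
                selected.append(strata[stratum].pop())
    return selected
```
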

We selected climatic, topographic, and edaphic 
variables for modeling based on their utility in de- 
scribing the environmental space occupied by M. 
pumila and their availability in statewide digital format.
In some cases, selected environmental variables (such as
terrain position or bedrock geology) have no direct
physiological effect on this species but serve as useful
surrogates for those factors that do have an influence
(such as microclimate, soil texture, or soil pH).


Figure 42.1. Dot distribution map of the known distribution of
Mentzelia pumila in Wyoming based on records of the Rocky
Mountain Herbarium.

For climate data, we used PRISM mean monthly 
precipitation available in 4-kilometer raster format 
(Daly et al. 1994, 1997) and unpublished PRISM 
mean monthly temperature data available in 2-kilome- 
ter raster format (Ken Driese personal communica- 
tion). Topographic data, including elevation, slope, 
and aspect, were calculated from 30-meter digital ele- 
vation model (DEM) coverages for the state. An index 
of landscape position for each 30-meter pixel was cal- 
culated following the protocol of Fels and Matson 
(1996) and then reclassified into four terrain-position 
categories based on overall slope and shape (concave, 
slope, flat, and convex) (Ken Driese personal commu- 
nication). We also used vector coverages of bedrock 
geology (Love and Christiansen 1985) and family- 
level soil type (Munn and Arneson 1998) derived from 
1:500,000-scale maps. Bedrock geology units of com- 
parable age and mineralogy were aggregated to form a 
coverage that approximated the statewide geologic 
highway map (Geological Survey of Wyoming 1991). 
Finally, land use and general vegetation type were derived
from the 1:500,000-scale Wyoming gap land-cover map
(Driese et al. 1997). Values for all environmental
attributes were assigned to each presence and absence
point. To minimize potential errors based on imprecise
locations, we derived terrain position from herbarium
specimen labels for known presence locations.


Figure 42.2. Dot distribution map of putative absence sites
for Mentzelia pumila in Wyoming.

Potential distribution models were constructed 
from our environmental data sets using logistic regres- 
sion (Minitab, version 11) and classification tree 
analysis (S-plus, version 1.1) (Breiman et al. 1984; 
Hosmer and Lemeshow 1989). For both techniques, 
our data set was randomly divided into nineteen pres- 
ent and 626 absent locations for model building and 
sixteen present and 644 absent locations for model 
testing. In each model, presence and absence were 
chosen as response variables, and selected environ- 
mental attributes were used as predictors. 
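
The random division into model-building and model-testing sets can be illustrated with a short sketch. The set sizes come from the text; the placeholder point identifiers, the fixed seed, and the function name are assumptions made only so the example runs.

```python
import random


def split_points(points, n_build, seed=0):
    """Randomly split points into a model-building set of size n_build
    and a model-testing set holding the remainder."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    return shuffled[:n_build], shuffled[n_build:]


# Placeholder identifiers standing in for the 35 presence and 1,270 absence points.
presence = [("present", i) for i in range(35)]
absence = [("absent", i) for i in range(1270)]

build_present, test_present = split_points(presence, 19)
build_absent, test_absent = split_points(absence, 626)
```
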

For the logistic regression analysis, we chose pre- 
dictors based on their statistical significance in trial 
runs (P < 0.05). We selected the simplest model with
the best goodness of fit for mapping in GIS (ArcView, 
version 3.1) at a resolution of 30 meters across the 
state domain. Using ArcView's map-calculator func- 
tion, we determined the probability of presence or ab- 
sence of M. pumila for each 30-meter pixel in the 
state. We then selected a cut-off probability for pres- 
ence that represented the midpoint between the mean 
probabilities of the known present and absent points 
(Fielding and Haworth 1995). Points above this value 
were designated present in our final map, while those 
below were deemed absent. 
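
The cut-off rule described above is simple arithmetic: average the fitted probabilities of the known present points, average those of the known absent points, and take the midpoint. A minimal sketch follows; the function names are ours, and assigning a pixel exactly at the cut-off to "absent" is an assumption, since the text addresses only values above and below it.

```python
from statistics import mean


def midpoint_cutoff(p_present, p_absent):
    """Midpoint between the mean fitted probabilities of the known
    present points and the known absent points."""
    return (mean(p_present) + mean(p_absent)) / 2


def classify(probability, cutoff):
    """Label a pixel from its predicted probability of presence."""
    # Values exactly at the cut-off are treated as absent (assumption).
    return "present" if probability > cutoff else "absent"
```

For instance, if the known present points averaged a fitted probability of 0.5 and the known absent points averaged 0.1 (hypothetical values), the cut-off would fall at 0.3.
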
In our classification tree model, we used box plots
to visually identify the environmental predictors that 
best discriminated between presence and absence of 
M. pumila. We developed several initial classification 
tree models using six environmental predictors (ter- 
rain position; bedrock geology; July mean tempera- 
ture; and January, April, and July mean precipitation) 
and a cut-off of five observations per terminal node 
beyond which no additional splitting occurred. For 
mapping, we selected the model with the fewest num- 
ber of terminal nodes (14) and highest residual mean 
deviance (0.081). This model used only four of the six 
predictor variables (April mean monthly precipitation, 
terrain position, bedrock geology, and January mean 
precipitation) and was pruned to predict the presence 
of M. pumila at four terminal nodes. For each of these 
nodes, we developed a predicted range map in Arc- 
View based on the intersection of their composite en- 
vironmental attributes. These individual layers were 
then merged to form the final statewide predicted 
range map at a resolution of 30 meters. 


Results 


Our logistic regression model was constructed with 
four environmental variables: April mean precipita- 
tion, terrain position, July mean temperature, and ele- 
vation. This model was strongly significant (G = 59.1, 
df = 4, P = 0.000) with good fit (Pearson deviance chi- 
square = 180.0, df = 607, P = 1.00). The multiple lo- 
gistic regression equation was 


ln(Y / (1 - Y)) = 23.04 - 1.0424 (terrain position) - 0.002966 (elevation)
+ 0.03252 (July mean temperature) - 0.0015948 (April mean precipitation).
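
As a hedged illustration of how such an equation yields a per-pixel probability, the sketch below evaluates the logit and applies the inverse logit transform. The coefficients are copied from the equation, with the elevation term read as negative; the dictionary keys and function names are hypothetical, and the units and coding of each predictor are not given in the text, so inputs would have to match whatever units the model was fitted with.

```python
import math

# Coefficients as read from the regression equation. The elevation sign and
# the predictor units are assumptions; the chapter does not state the units.
INTERCEPT = 23.04
COEFFICIENTS = {
    "terrain_position": -1.0424,
    "elevation": -0.002966,
    "july_mean_temperature": 0.03252,
    "april_mean_precipitation": -0.0015948,
}


def logit(predictors):
    """Linear predictor ln(Y / (1 - Y)) for one pixel."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * predictors[name] for name in COEFFICIENTS
    )


def probability_of_presence(predictors):
    """Inverse logit, written to avoid overflow for large |logit|."""
    z = logit(predictors)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

A pixel would then be mapped as present when `probability_of_presence(...)` exceeds the chosen cut-off probability.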


Of the four predictors, April precipitation had the
strongest influence on the model (Z = -4.34, P =
0.000), followed closely by terrain position (Z =
-3.29, P = 0.001), July mean monthly temperature (Z
= 2.33, P = 0.020), and elevation (Z = [illegible], P =
0.036).

The model successfully classified thirteen (68.4 
percent) of the known present points and 576 (92 per- 
cent) of the known absent points in the modeling data 
set, for an overall correct classification rate of 92 percent
(Table 42.1). For the validation data set, the
model correctly predicted eight (50 percent) of
the known present points and 573 (89 percent) of the 
known absent points for an overall success of 88 per- 
cent (Table 42.1). The distribution map produced 
from this model (Fig. 42.3) predicted presence for this 
species over an area of 49,482 square kilometers. Fig- 
ure 42.4 illustrates the probability of occurrence at 
each pixel. 

The final classification tree model selected April 
mean monthly precipitation as the most important 
predictor for the initial subdivision of the model- 
building data set followed by terrain position, geo- 
logic substrate, and January mean monthly precipita- 
tion. The model predicted likely presence of M. 
pumila under four different sets of environmental con- 
ditions (Table 42.2). The map derived from this model 
(Fig. 42.5) predicted a total statewide range of 15,962 
square kilometers. 

This model correctly classified seventeen of the 
known presence points (89.5 percent) and 562 of 
the known absence points (89.8 percent) from the 
modeling data set for a total classification success rate 
of 89.8 percent (Table 42.1). From the validation data 
set, the model correctly classified nine of the known 
present points (56.3 percent) and 548 (85 percent) of 
the known negative points (Table 42.1). 


Model Performance 


Both the logistic regression and classification tree
models had overall classification success rates of
89.8-92 percent for the model-building data sets and
84.4-88 percent for the validation data sets. Fielding
(Chapter 21), however, has suggested that the overall
rate of classification success is a poor indicator of
prediction accuracy because this measure does not
distinguish between errors of omission (missed present
points) and commission (false positives) and therefore
weights these errors equally, when in most situations
missed present points have more serious management
consequences. The omission error rate was considerably
higher (31.6 percent) for the model-building data set
in our logistic regression model than in the comparable
data set for the classification tree model (10.5 percent)
(Table 42.1), although the commission error
rates for both were similar (8 percent in the logistic
regression model and 10.2 percent for the classification
tree model). Error rates were also comparable in both
models for the validation data sets, although in this
case, the logistic regression model performed slightly
better in terms of commission errors.


TABLE 42.1.

Comparison of classification success rates for logistic regression and classification tree models.

Logistic regression model
                  Model data set                       Validation data set
                  Model present     Model absent       Model present     Model absent
Known present     13/19 (68.4%)     6/19 (31.6%)       8/16 (50%)        8/16 (50%)
Known absent      50/626 (8%)       576/626 (92%)      71/644 (11%)      573/644 (89%)

Classification tree model
                  Model data set                       Validation data set
                  Model present     Model absent       Model present     Model absent
Known present     17/19 (89.5%)     2/19 (10.5%)       9/16 (56.2%)      7/16 (43.8%)
Known absent      64/626 (10.2%)    562/626 (89.8%)    96/644 (15%)      548/644 (85%)
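
The omission and commission rates quoted in this section follow directly from the counts in Table 42.1. The short Python sketch below recomputes them; the function and variable names are hypothetical, and only the counts come from the table.

```python
def error_rates(hits, misses, false_positives, correct_absences):
    """Return (omission, commission) error rates as fractions.

    omission   = known present points predicted absent / all known present
    commission = known absent points predicted present / all known absent
    """
    omission = misses / (hits + misses)
    commission = false_positives / (false_positives + correct_absences)
    return omission, commission


# Counts from Table 42.1, in the order
# (present->present, present->absent, absent->present, absent->absent).
TABLE_42_1 = {
    ("logistic regression", "model-building"): (13, 6, 50, 576),
    ("logistic regression", "validation"): (8, 8, 71, 573),
    ("classification tree", "model-building"): (17, 2, 64, 562),
    ("classification tree", "validation"): (9, 7, 96, 548),
}

rates = {key: error_rates(*counts) for key, counts in TABLE_42_1.items()}

for (model, data_set), (omission, commission) in rates.items():
    print(f"{model:20s} {data_set:15s} "
          f"omission {omission:6.1%}  commission {commission:6.1%}")
```

Reporting the two rates separately, rather than a single overall success rate, keeps the rarer presence class from being swamped by the much larger number of absence points.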

Variation in prediction success rates may be related
in part to differences in the choice of predictor
variables between the two models. Both utilized April
mean precipitation and terrain position, but only the
classification tree model distinguished between levels
of precipitation (selecting values between 27.25 and
33 millimeters) or terrain values (selecting only slopes
or swales). Likewise, the classification tree model also
selected categories or ranges of values for bedrock
geology and January mean precipitation in model
development, but the logistic regression model made no
distinction between values of elevation or July mean
temperature. The higher omission error rates in the
logistic regression model may result from that model
being too simplistic or from using parameters that are
too conservative (Fielding, Chapter 21).


Figure 42.3. Predicted distribution of Mentzelia pumila in
Wyoming based on a logistic regression model. Black area =
predicted present, white area = predicted absent or no data.

Differences in prediction success rates may also
reflect methodological differences between the two
modeling techniques. The logistic regression model uses a
single regression equation to identify the optimal
condition under which this species is present, whereas the
classification tree model identifies multiple conditions
under which M. pumila may occur. The environmental
attributes used by the four pathways in the latter
model are quantifiable and readily testable in the field
(Franklin 1995), whereas the coefficients of the
logistic regression equation are difficult to interpret.
Logistic regression models are better suited for
assessing probability of occurrence on a per-pixel basis than
classification tree models (Dettmers et al., Chapter 54)
but may be more prone to overestimation of potential
range if cutoff probabilities are low (Fielding, Chapter
21).


TABLE 42.2.

Combinations of environmental attributes used to model Mentzelia pumila.

Group  Mean precipitation       Terrain position   Bedrock geology
1      April: ≤ 33 mm           Slopes or swales   Late Paleozoic/early Mesozoic sediments, Middle Cretaceous shales,
                                                   or Miocene sandstone/conglomerates
2      April: ≤ 27.25 mm        Slopes or swales   Upper Cretaceous shales, late Eocene interbedded sandstones, or
       January: 7.25-8.85 mm                       Quaternary alluvium
3      April: 27.25-33 mm       Slopes or swales   Upper Cretaceous shales, late Eocene interbedded sandstones, or
       January: < 12.53 mm                         Quaternary alluvium
4      April: < 33 mm           Slopes or swales   Jurassic/early Cretaceous sediments, Lower Cretaceous shales,
       January: 12.53-16.1 mm                      Upper Cretaceous shales, Tertiary intrusive volcanics, Paleocene
                                                   interbedded sandstones, late Eocene interbedded sandstones,
                                                   Quaternary alluvium, Quaternary lacustrine deposits, or Quaternary till


Sources of Error 


Errors in classification success may also be affected by 
imprecision in the plant location data set. Location 
points have an error range of 0.5 to 2.5 kilometers, 
depending on the label precision of the original speci- 
mens. Present locations may be ruled as absent if they 
fall within this error range on the projected range 
map. Including a buffer around each present and ab- 
sent point could improve prediction accuracy. In the 
case of M. pumila, however, we found no difference 
between prediction success for present points with a 
buffer of 2.5 kilometers and unbuffered points. 

Some error may derive from mistakes inherent in 
the environmental data sets. The conversion of small- 
scale cartographic maps to digital format may intro- 
duce spatial errors of up to 1 kilometer (Munn and 
Arneson 1998). DEMs frequently contain errors that 
can become amplified when they are used to derive 
slope, aspect, and terrain position, as we have done 
(Davis and Goetz 1990; Franklin 1995). Precipitation 
and temperature data derived by averaging values 
across each grid cell have a spatial precision no better 
than one-half the resolution of the cell (Daly et al. 
1997). Due to the errors in all of these variables, digi- 
tal versions of our modeled range maps should not be 
used at a scale below 2 kilometers. 


Other sources of omission error in our models in- 
clude an inadequate number of location points for M. 
pumila, the lack of digital data sets for important bi- 
otic factors (such as distribution of pollinators or seed 
dispersal vectors), and spatial autocorrelation. Com- 
mission errors may result from stochastic events, his- 
torical accidents, or incomplete dispersal into other- 
wise suitable environments (Fielding, Chapter 21). 


Management Implications 


Although the two models produced range maps en- 
compassing similar regions of the state (Figs. 42.3 and 
42.5), the logistic regression model predicted a total 
area nearly three times larger than the classification 
tree model. This disparity in size is probably due to 
the low cut-off level (6 percent) used to assign the 
probability of presence/absence to each pixel in the lo- 
gistic regression model. When this map was recalcu- 
lated based on higher probability cutoffs (greater than 
25 percent), the resulting map depicts a smaller pre- 
dicted range than the classification tree map does (Fig. 
42.4). It is interesting to note that both modeling 
methods predicted the potential occurrence of M. 
pumila in northeastern Wyoming (Weston County), an 
area where this species has not previously been docu- 
mented (Figs. 42.1, 42.3, 42.5). Under either mapping 
scenario, M. pumila is probably more widespread in 
Wyoming than current dot maps would suggest (Fig. 
42.1). 

Figure 42.4. Probability of occurrence of Mentzelia pumila in
Wyoming based on a logistic regression model. Black area = 
probability of occurrence greater than 50 percent, gray 
area = probability of occurrence 25-50 percent, white area = 
probability of occurrence less than 25 percent, or no data. 


Range maps based on correlative models are useful 
for land managers and conservation biologists faced 
with the need to identify potential locations for rare or 
economically important plant species while con- 
strained by limited budgets, time, or manpower. These 
models can be used to prioritize areas for survey or to 
identify potential preserves or reintroduction sites 
(Elith and Burgman, Chapter 24), to perform gap 
analyses (Fertig et al. 1998), or to determine baseline 
conditions for climate-change studies (Iverson et al. 
1999). More important, the construction of these pre- 
dictive maps is becoming increasingly easy and afford- 
able as GIS and geostatistical programs improve and 
large-scale environmental and herbarium point-location
data sets become more readily available.


Figure 42.5. Predicted distribution of Mentzelia pumila in 
Wyoming based on a classification tree model. Black area = 
predicted presence, white area = predicted absence. 

A Model to Predict the Occurrence of
Surviving Butternut Trees in the 
Southern Blue Ridge Mountains 


Frank T. van Manen, Joseph D. Clark, Scott E. Schlarbaum, 
Kristine Johnson, and Glenn Taylor 


Butternut, or white walnut (Juglans cinerea), is a
highly valued hardwood species native to eastern 
North America. The tree is closely related to black 
walnut (Juglans nigra) and occurs on cove hardwood 
sites, but it can also grow on poorer and drier sites. 
The wood of butternut is used for veneer and lumber, 
and the mast is eaten by a variety of wildlife species. 

Many butternut populations are currently being 
decimated by an exotic fungus, Sirococcus clavigig- 
nenti-juglandacearum. This fungus causes multiple 
branch and stem cankers with a characteristic black 
color. Main stem cankers eventually girdle the tree re- 
sulting in death. The disease was discovered in 1967 in 
southwestern Wisconsin (Renlund 1971), although it is 
believed to have first arrived on the eastern coast of the 
United States at least forty to sixty years ago (Ander- 
son and LaMadeleine 1978). The canker has spread 
throughout most of the species’ range, with greatest 
impacts on southeastern populations, where an esti- 
mated 77 percent of the butternuts are dead (USDA 
1995). Because of this rapid decline of butternut popu- 
lations, the U.S. Fish and Wildlife Service is currently 
reviewing the status of the species. 

Surviving butternut trees in the southern Blue Ridge 
Mountains are primarily found in close proximity to 
streams. Because the disease can affect young trees 
and even seeds, genetic material of infected popula- 
tions can be permanently lost. No effective strategies 


exist to protect butternut trees from the canker. How- 
ever, putatively resistant trees have been located on the 
Daniel Boone National Forest in Kentucky and the 
Pisgah National Forest in North Carolina. If resist- 
ance is confirmed, a breeding program could develop 
disease-resistant trees. This strategy could eventually 
be used to return butternut to the southern Blue Ridge 
Mountains, but this would be predicated upon the 
ability to transfer resistance using germplasm with a 
sufficiently broad genetic base. Therefore, the ability 
to efficiently locate enough canker-resistant stock for 
evaluation and subsequent breeding is essential. At 
present, resistant stock is slowly being located by ex- 
tensive searches, but that requires substantial time and 
resources. 

Predicting the occurrence of organisms requires 
knowledge of resource conditions that lead to occu- 
pancy by a particular species. Some indicators of habi- 
tat where butternut is likely to occur can be discerned 
based on field experience, but other complex ecologi- 
cal relationships cannot. Moreover, habitat associa- 
tions consistent with butternut occurrences can be eas- 
ily overlooked in areas where access is prohibited, 
when field surveyors have limited experience or training,
or when occurrence does not fit the surveyor's established
"search image." GIS-based modeling, using
multivariate statistics, can be an efficacious approach 
to predicting species occurrence across relatively large 
areas (Scott et al. 1993). A reliable model of butternut 
occurrence would enable managers to identify priority
areas to survey for putatively resistant trees. Addition- 
ally, a spatial modeling approach may provide insights 
into habitat conditions and processes contributing to 
survival of butternut trees exposed to the fungus. Be- 
fore such a model should be used, however, a thor- 
ough evaluation of its reliability is necessary. The ob- 
jectives of this study were to develop a GIS-based 
occurrence model using habitat data with locations of 
surviving butternut trees and to test model predictions 
with independent field data. 


Methods 


We used 134 butternut locations from Great Smoky 
Mountains National Park (hereafter, GSMNP) to de- 
velop the model (Fig. 43.1). These locations were col- 
lected from 1990 to 1997 and compiled by Natural 
Heritage programs administered by North Carolina, 
Tennessee, and GSMNP. These 134 locations repre- 
sented the entire database of known butternut trees 
within GSMNP. Butternut locations were identified by 
National Park Service botanists during surveys and in- 
ventories for other species throughout GSMNP, and, 
as such, should represent a relatively unbiased sample. 
We used a GIS database of eleven variables (cover- 
ages) to characterize habitat conditions of these but- 
ternut locations (Table 43.1). These variables were se- 
lected based on documented habitat associations of 
the species, field experience of botanists, and earlier 
experience developing models for other plant species. 
All GIS coverages were continuous variables based on 
92.9x92.9-meter pixels. 

Mahalanobis distance modeling. Mahalanobis distance
(D²) is a multivariate statistic that describes a
measure of dissimilarity (Rao 1952). This statistic,


D² = (x - û)′ Σ⁻¹ (x - û)


was calculated for each pixel in the GIS coverage of
the study area by combining the information from the
eleven habitat coverages, where x is a vector of habitat
characteristics for each cell in the GIS grid, û is the
mean vector of habitat characteristics of the original
sampling points, and Σ⁻¹ is the inverse of the
variance-covariance matrix calculated from the sampling
points. This statistic represents the standard squared
distance between a set of sample variates, x, and
"ideal" habitat represented by û. Thus, low values
indicate conditions similar to those of the original
sampling points, with greater D² values indicating
increasingly dissimilar conditions. Mahalanobis distance is
dimensionless because it is a function of standardized
variables, despite the different measurement scales
among the original variables. There is no one "best"
combination of variables that results in the lowest D²
values; a variety of habitat combinations can result in
identical distance values (Clark et al. 1993b).
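
The statistic can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the SAS and ArcInfo GRID workflow used in the study: the function names are hypothetical, the two habitat variables and their values are invented, and the linear system is solved by Gaussian elimination rather than by forming the inverse matrix explicitly (the result is the same quadratic form).

```python
def mean_vector(samples):
    """Column means of the sample points."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def covariance_matrix(samples, mean):
    """Sample variance-covariance matrix (divisor n - 1)."""
    n, p = len(samples), len(mean)
    cov = [[0.0] * p for _ in range(p)]
    for row in samples:
        d = [value - m for value, m in zip(row, mean)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov


def solve(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [list(row) + [b] for row, b in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("variance-covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def mahalanobis_d2(x, mean, cov):
    """D2 = (x - mean)' * inverse(cov) * (x - mean) for one pixel."""
    d = [value - m for value, m in zip(x, mean)]
    return sum(di * si for di, si in zip(d, solve(cov, d)))


# Hypothetical sample points: (elevation in m, distance to stream in m).
samples = [(500.0, 20.0), (650.0, 35.0), (800.0, 10.0),
           (700.0, 60.0), (600.0, 25.0)]
mean = mean_vector(samples)
cov = covariance_matrix(samples, mean)
d2_values = [mahalanobis_d2(row, mean, cov) for row in samples]
print(d2_values)
```

In a raster setting the same calculation would be repeated for every pixel, with `x` holding that pixel's values for all habitat coverages.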

We calculated û and Σ⁻¹ with SAS (SAS Institute
1990a) based on the habitat characteristics of the 134
sample locations in GSMNP. Subsequently, we used
this information to calculate D² in ArcInfo GRID
(ESRI, Redlands, Calif.) for a 31,235-square-kilome- 
ter area in and around GSMNP; we refer to this area 
as the southern Blue Ridge Mountains (Fig. 43.1). 
This area was within the species’ range (Rink 1990) 
and included counties where butternuts were surveyed 
for resistance during 1997. We used the Southern Ap- 
palachian Assessment database (Hermann 1996), a 
multiagency cooperative effort to assess and map the 
region’s natural resources, to generate a GIS database 
for this area with the same eleven variables used to 
characterize the butternut sites upon which the model 
was based (Table 43.1). The pixel size of the regional 
GIS coverages also was set to 92.9x92.9 meters and 
D² was calculated for each pixel.

To test the hypothesis that the probability of
encountering butternut increases with decreasing values
of D², we generated 130 random sample points within
the southern Blue Ridge Mountains (Fig. 43.1), but
only in areas less than 500 meters from improved
roads on publicly owned land. We restricted model
testing to this smaller portion of the study area
because of rugged terrain and the logistical difficulties of
acquiring permits to access private property. We
assumed the test locations were representative because
the distribution of D² values of the test locations did
not differ from the distribution for the entire southern
Blue Ridge region (asymptotic Kolmogorov-Smirnov
test statistic [illegible], P = 0.164).

Test locations were visited during spring 1999 and
were coded so that field personnel had no knowledge
of model predictions for that site. Fourteen of the 130
test plots were in GSMNP and were located with a
global positioning system (GPS) (PLGR+96, Rockwell
International, Cedar Rapids, Iowa) with unassisted
military Y-code signal (3.5-meter accuracy). The
remaining 116 test plots were located with a Garmin 12
XL GPS receiver (Navtech GPS, Alexandria, Va.).
Because we could not achieve real-time differential
corrections during location of these 116 test plots, their
accuracy was subject to the U.S. Department of
Defense accuracy degradation of up to 100 meters. Once
each test point was identified, an area of 92.9x92.9
meters centered on this point was sampled for
presence or absence of butternut trees. Location
coordinates of butternut trees encountered in the vicinity
(less than 500 meters) of the test plots also were
recorded.


TABLE 43.1.

Geographic information system variables used to calculate Mahalanobis distance values in the southern Blue Ridge Mountains
based on butternut (Juglans cinerea) locations in Great Smoky Mountains National Park (GSMNP), 1990-1997.

Variable                             Description                                       Value rangeᵃ  Source
Aspect                               Aspect transformed: 1 + cos(45 - aspect)          0.0-2.0       Calculated from aspect (Hermann 1996) based on Beers et al. (1966)
Elevation                            Elevation (m)                                     308-1,470     U.S. Geological Survey digital elevation model from Hermann (1996)
Proximity to streams                 Proximity to nearest stream (m)                   0-956         Calculated from streams coverage (Hermann 1996) with the EUCDISTANCE command (ArcInfo GRID)
Planform curvatureᵇ                  Slope curvature in horizontal plane               -0.21-0.35    Calculated from elevation with the CURVATURE command (ArcInfo GRID)
                                     (divergence and convergence of water flow)
Profile curvatureᶜ                   Slope curvature in vertical plane                 -0.19-0.31    Calculated from elevation with the CURVATURE command (ArcInfo GRID)
                                     (acceleration and deceleration of water flow)
Relative slope position              Relative slope position (%)                       0-100         Calculated from elevation based on Wilds (1996)
Slope                                Slope steepness (degrees)                         1-26          Calculated from elevation with the SLOPE command (ArcInfo GRID)
Solar insolation                     Index of exposure to sunlight; approximated       188-378       Calculated from elevation with the HILLSHADE command (ArcInfo GRID)
                                     for the solar equinox
Topographic complexity               Shannon-Wiener index of topographic complexity    18.7-33.8     Calculated based on Miller (1986)
                                     considering elevation, aspect, and slope
Topographic convergence index        Simulates the flow accumulation of water;         -1.0-13.2     Calculated based on Beven and Kirkby (1979), Wolock and McCabe (1995), and Halpin (1995)
                                     TCI = ln(A/tan B), where A is drained surface
                                     area and B is drained surface slope
Topographic relative moisture index  Index of moisture considering the effects of      13-59         Calculated based on Parker (1982)
                                     slope position, aspect, and elevation

ᵃ Value ranges are based on the 134 butternut locations sampled in GSMNP.
ᵇ Negative planform curvature indicates an upwardly concave curvature perpendicular to the direction of the slope; positive planform curvature
indicates an upwardly convex curvature perpendicular to the direction of the slope.
ᶜ Negative profile curvature indicates an upwardly concave curvature of the surface in the direction of the slope; positive profile curvature indicates
an upwardly convex curvature of the surface in the direction of the slope.

Figure 43.1. Locations of butternut (Juglans cinerea) trees in Great Smoky Mountains National Park used for model development
(1990-1997) and locations of plots for model testing in the southern Blue Ridge Mountains (1999). 


To assess model validity, we determined presence or
absence of butternut trees for each test plot and used
these observations as the dependent binomial variable
in a logistic regression (Hosmer and Lemeshow 1989).
We chose logistic regression to determine the
relationship between butternut occurrence and the D² values
of the pixels corresponding to the test plots. We tested
all models for the assumption of linearity in the logit
by plotting the D² values against associated logit
values. We tested the fit of the model by use of the
Hosmer-Lemeshow goodness-of-fit test (Hosmer and
Lemeshow 1989). To determine whether statistical
relationships were affected by scaling, we calculated
mean D² values for areas surrounding the pixels of the
test plots. We calculated this mean value based on
square "windows" of 279, 465, 650, 836, and 1,022
meters on a side.
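
With 92.9-meter pixels, the window lengths listed above correspond to blocks of 3, 5, 7, 9, and 11 pixels on a side. The sketch below shows one way such a window mean could be computed; it is an illustration only, the function name and the small grid are hypothetical, and skipping cells that fall outside the grid is an assumption, since edge handling is not described in the text.

```python
PIXEL_M = 92.9  # pixel size of the GIS coverages, in meters
WINDOW_LENGTHS_M = [279, 465, 650, 836, 1022]

# Number of pixels on a side for each window length.
window_sizes = [round(length / PIXEL_M) for length in WINDOW_LENGTHS_M]


def window_mean(grid, row, col, size):
    """Mean of the size x size block of cells centered on (row, col).

    Cells outside the grid are skipped (an assumed edge rule).
    """
    half = size // 2
    values = [
        grid[r][c]
        for r in range(row - half, row + half + 1)
        for c in range(col - half, col + half + 1)
        if 0 <= r < len(grid) and 0 <= c < len(grid[0])
    ]
    return sum(values) / len(values)


# Hypothetical 5 x 5 grid of distance values, invented for the example.
grid = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
center_mean = window_mean(grid, 2, 2, 3)
corner_mean = window_mean(grid, 0, 0, 3)
print(window_sizes, center_mean, corner_mean)
```

The window mean for each test plot would then replace the single-pixel value as the predictor in the logistic regression.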


Results 


We calculated D² values for each pixel in the southern
Blue Ridge Mountains based on the model developed
from the original 134 butternut locations; these values
ranged from 1.0 to 2,648.6, with a mean of 33.1 (sd =
31.7; see Fig. 43.2 in color section). D² values
corresponding to the 134 model input positions (i.e.,
known butternut locations in GSMNP) ranged from
2.5 to 110.3 (mean = 11.2, median = 8.4, sd = 11.4)
and the regional test points had D² values ranging
from 3.1 to 366.2 (mean = 19.8, median = 13.2, sd =
33.6). Sixteen of the 130 test plots contained
butternut trees. Seventy-five percent of the butternut
occurrences were in test plots with D² values less than 10.0,
and 94 percent were in pixels with values less than
13.6 (Fig. 43.3). Conversely, the proportion of test
plots containing butternut declined with increasing
values of D²: six out of nine (0.67) of the plots with
values less than 5.0, five out of thirty-eight (0.13)
with values ≥ 5.0 and less than 10.0, four out of
thirty-seven (0.11) with values ≥ 10.0 and less than
20.0, and one out of forty-six (0.02) with values ≥
20.0 (Fig. 43.2). The logistic regression analysis of the
test data indicated a significant negative association
between D² and presence of butternut (parameter
estimate = -0.162, P = 0.007) and explained 19.3 percent
of the variation (Table 43.2). The model fit the data
(Hosmer and Lemeshow goodness-of-fit statistic =
11.1, 8 df, P = 0.196) and was linear in the logit. The
relationship between D² and butternut occurrence was
not significant for mean D² values in windows ≥ 279
meters (Table 43.3).


TABLE 43.2.

Estimated parameters (coefficients) of a logistic regression model to determine the relationship between
Mahalanobis distance values and presence or absence of butternut (Juglans cinerea) in 130 test plots, southern
Blue Ridge Mountains, 1999.ᵃ

Variable               Parameter estimate   Standard error   Wald χ²   P > χ²ᵇ   Standardized estimate
Intercept              -0.037               0.625            0.004     0.952
Mahalanobis distance   -0.162               0.060            7.322     0.007     -3.001

ᵃ Model statistics: Hosmer-Lemeshow goodness-of-fit statistic = 11.11, 8 df, P = 0.196; maximum rescaled R² = 0.193.
ᵇ P-value indicating the probability of a greater value based on the Wald χ² statistic.


TABLE 43.3.

Estimated parameters (coefficients) of a logistic regression
model to determine the relationships between Mahalanobis
distance values and presence or absence of butternut
(Juglans cinerea) in areas of different sizes surrounding 130
test plots, southern Blue Ridge Mountains, 1999.

Window size (m)   Parameter estimate   Pᵃ
93 x 93           -0.162               0.007
279 x 279         -0.082               0.059
465 x 465         -0.043               0.197
650 x 650         -0.016               0.525
836 x 836         -0.002               0.859
1,022 x 1,022      0.001               0.918

ᵃ P-value indicating the probability of a greater value based on the Wald
χ² statistic.
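
To make the fitted relationship concrete, the sketch below converts a single-pixel Mahalanobis distance into a predicted probability of butternut presence, using the intercept (-0.037) and the slope (-0.162) reported for the test data. The function name is hypothetical and the distance values passed to it are arbitrary examples, not results from the study.

```python
import math

# Logistic regression estimates reported for the 130 test plots.
INTERCEPT = -0.037
SLOPE = -0.162  # negative: probability declines as the distance increases


def probability_of_butternut(d2):
    """Predicted probability of presence for a Mahalanobis distance value."""
    z = INTERCEPT + SLOPE * d2
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # z < 0, so this cannot overflow
    return e / (1.0 + e)


# Arbitrary example distances chosen to span the classes used in the text.
for d2 in (2.5, 5.0, 10.0, 20.0, 50.0):
    print(f"distance = {d2:5.1f}  predicted probability = "
          f"{probability_of_butternut(d2):.3f}")
```

The fitted curve is a smoothed summary; the observed proportions for the individual distance classes reported above need not match it exactly.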


Discussion 


Our analysis indicates that the Mahalanobis distance
statistic can be an effective measure to delineate
potential butternut habitat. For the southern Blue Ridge
Mountains, the presence of butternut trees was more
likely with decreasing values of D². However, this
relationship explained only a small portion of the
variation (19.3 percent); other factors that we did not
incorporate in the model likely contributed to the
presence or absence of butternut. The high false-positive
rate shows that low D² values do not necessarily
indicate species presence. However, one would
not expect that species always occupy all suitable
habitats. Because of dispersion limitations, competition,
and other life-history phenomena, its absence at
one point in time does not imply that the species was
absent in the past or that it will not be present in the
future (Andrewartha and Birch 1954; Hanski and
Simberloff 1997). In addition, a possible confounding
factor is the decline of butternut populations. This
decline may result in species absence despite ideal
environmental conditions.

When evaluating model performance, spatial errors,
such as locational errors associated with the original
butternut records, GIS mapping errors, and GPS
errors, should be considered (Fertig and Reiners,
Chapter 42). Although these errors are unlikely to
result in biases, they may affect the power to test for
model performance. For example, test plots likely did
not exactly coincide with the targeted pixel but
instead were at least partially located within neighboring
pixels. Despite the probable presence of such errors,
however, we found that D² values successfully
predicted butternut occurrence.


Figure 43.3. Cumulative frequency of butternut (Juglans cinerea) presence in test plots and associated Mahalanobis distance
values, southern Blue Ridge Mountains, 1999.

The data we used to develop the model came from 
a small spatial subset of the area to which predictions 
were made. Therefore, one primary concern is 
whether a model based on such data can be appropri- 
ately extrapolated to larger areas. We devised the in- 
dependent test to specifically address this issue, and 
we conclude that our model was a good descriptor of 
butternut habitat in the southern Blue Ridge Moun- 
tains. Furthermore, the median value of Mahalanobis 
distance was similar for the original GSMNP locations 
(median = 8.4, n = 134) compared with the test loca- 
tions where butternut was found elsewhere in the re- 
gion (median = 6.7, n = 13). 

Butternut occurrence did not show significant
relationships with mean D² values calculated for square
areas ≥ 279 meters on a side centered on the targeted
test pixel; with increasing area of the squares, the
relationship tended toward a random model. We conclude
that the model predictions were site specific and little
affected by surrounding habitat conditions. A model
developed using finer-grained habitat attributes than
we used might be more accurate than ours is.

Many modeling efforts have not employed field 
tests to evaluate model predictions. Although various 
quasi-validation procedures have been used (e.g., jack- 
knifing or bootstrapping), they do not eliminate possi- 
ble biases in the collection of the original samples. We 
submit that the original butternut locations may not 
represent a completely unbiased sample. However, the 
field tests we conducted were independent of the obser- 
vations used to develop the model and allowed us to 
more accurately quantify model performance, which 
can then be used to improve the model. These inde- 
pendent test results seem to indicate that the original 
butternut locations were relatively unbiased. 

Six of nine (67 percent) test plots with D² values
less than 5.0 contained butternut trees. Thus, when 
visiting areas represented by these pixels, resource 
managers can expect a relatively high probability of 
finding butternuts. Because these areas can easily be 
identified from the digital map (Fig. 43.2) and com- 
prise only a small portion of the study area (1.2 per- 
cent in this example), searches can be conducted more 
efficiently and areas unfamiliar to surveyors can be 
searched. Although experts could find butternuts 
based on their knowledge of habitat indicators, their 
"search images" may differ or be prejudiced, and 
other habitat areas may be overlooked. Thus, applica- 
tions of GIS technology coupled with multivariate sta- 
tistics, as shown here, can be particularly useful for 
management of species that are in decline and may 
help guide restoration efforts or assist in the identifica- 
tion of high-quality habitat. 
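
As a rough illustration of how a D² cutoff such as the 5.0 value above might be applied to map pixels, the following minimal sketch computes a squared Mahalanobis distance for only two habitat variables. The variable names, the mean vector, and the covariance values are hypothetical and are not taken from the butternut model, which presumably used more variables.

```python
def mahalanobis_d2(x, mean, cov):
    """Squared Mahalanobis distance for exactly two habitat variables.

    x and mean are (v1, v2) pairs; cov is a 2x2 covariance matrix
    given as ((a, b), (c, d)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Inverse of a 2x2 matrix is (1/det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


# Hypothetical values: mean elevation (m) and slope (degrees) at known
# butternut sites, with an assumed diagonal covariance matrix.
site_mean = (1000.0, 20.0)
site_cov = ((10000.0, 0.0), (0.0, 25.0))

pixel = (1100.0, 25.0)
d2 = mahalanobis_d2(pixel, site_mean, site_cov)
high_priority = d2 < 5.0  # cutoff discussed in the text
```

Small D² values mark pixels whose habitat conditions resemble those at the known locations, so a surveyor could sort pixels by D² and visit the lowest values first.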

Biometric models of plant-habitat relationships are 
less common than wildlife habitat models, the latter 
being an active area of research in recent decades (see 
this volume and Verner et al. 1986b). In many ways, 
however, plants may be better subjects for habitat 


modeling. For example, plants are sessile and are more 
affected by microclimate, topography, and other typi- 
cal GIS-based variables than most animals. Habitat 
conditions associated with an individual plant or tree 
are relatively static, whereas animals are mobile and 
habitat characteristics at their location at any one 
point in time may not be critical to survival. The test- 
ing of animal-habitat relationships is particularly chal- 
lenging because false positive rates will likely be much 
greater than for plant models. Finally, field testing is 
more straightforward for plant models because pres- 
ence is easier to detect. 
Predicting Meadow Communities and 
Species Occurrences in the 
Greater Yellowstone Ecosystem 


Diane M. Debinski, Mark E. Jakubauskas, Kelly Kindscher, 
Erika H. Saveraid, and Maria G. Borgognone 


For the past eight years, our research team has 
been developing predictive models of montane 
meadow communities in the Greater Yellowstone 
Ecosystem (e.g., Debinski 1996; Debinski and Hum- 
phrey 1997; Jakubauskas et al. 1998; Kindscher et 
al. 1998; Debinski et al. 1999). Our models are 
based upon landscape-level habitat analysis using ge- 
ographic information systems (GIS) and remotely 
sensed data (Urban et al. 1987; Turner 1989; Davis 
et al. 1990; Hobbs 1990; Stoms and Estes 1993; Ed- 
wards et al. 1996) for small areas (0.25-hectare map- 
ping unit), combined with field-sampled data of the 
biotic communities. We have focused on predicting 
species and communities along a gradient of mon- 
tane meadow communities (spectrally distinct re- 
gions labeled M1-M7) ranging from extremely hy- 
dric wetlands (M1 meadows) to mesic, forb-rich 
meadows (M3 meadows) to xeric, sagebrush flats 
(M6 meadows), and to sites where bare ground is the 
predominant cover (M7). 

The flora and fauna of these meadows were stud- 
ied within two regions of the Greater Yellowstone 
Ecosystem: Grand Teton National Park and the 
Bridger-Teton National Forest of northwestern 
Wyoming (hereafter termed “Tetons”) and the Gal- 
latin National Forest and Yellowstone National Park 
in Montana (hereafter termed “Gallatins”). The tax- 
onomic groups we investigated include birds, butter- 
flies, and plants. Our assumptions were that the 
landscape was relatively pristine, and thus commu- 
nity structure should have been more predictable 
than in a recently altered habitat (e.g., Hanski et al. 
1996b), and that species-habitat relationships could 
be predicted from abiotic factors such as a moisture 
gradient. Further, less than 1 percent of the species 
examined had range edges within our sampling 
area. Thus, if the habitat was present, the species 
could have potentially been present at that site. We 
limited our research to lower-elevation montane 
meadows (2,000-2,500 meters) to reduce the num- 
ber of variables affecting the predictability of species 
distributions. 

Our objectives were (1) to use remotely sensed im- 
agery to map different montane meadow communi- 
ties and develop spectrally based spatially explicit 
models for predicting plant and animal species 
distribution patterns in montane meadows, and (2) 
to test the predictive models in two areas of the 
ecosystem. We investigated the potential of remote 
sensing to map and predict montane meadows in 
general, and we examined specific subsets of these 
meadow types, such as montane wetland (Kindscher 
et al. 1998) and sagebrush (Jakubauskas et al. 1998) 
vegetation communities. We also examined the asso- 
ciations of specific animal species with our meadow 
types. In this chapter, we present a synthesis of these 
approaches. 
Focal Taxa 


Plants can be viewed both as a component of species 
diversity and as a component of habitat diversity. The 
presence of a particular plant species in a specific site 
can be highly indicative of the microhabitat of that 
site. Because vegetation is a major contributor to spec- 
tral reflectance patterns measured by satellite sensors, 
we believe that it is imperative that we test the rela- 
tionship between remotely sensed meadow types and 
plant communities. If plant species distribution pat- 
terns can be predicted using remotely sensed data, re- 
lationships between remotely sensed data and animal 
taxa would be more probable. Thus, a plant survey is 
the critical link between remotely sensed data, habitat, 
and other species distribution patterns. 

Butterflies are suitable species for testing the hy- 
pothesis that remotely sensed data can be used to pre- 
dict species distributions because they are moderately 
to highly host-specific herbivores, and their diversity 
may be correlated with underlying plant diversity. But- 
terflies are also well known taxonomically and can be 
reliably identified in the field (Kremen 1992). Over 
one hundred species reside in the Greater Yellowstone 
Ecosystem (Bowser 1988; Brussard 1989). Birds are 
suitable because they are ecologically diverse and use a 
wide variety of food and other resources and reflect 
the condition of many aspects of the ecosystem. Birds 
often respond to spatial and temporal variation in a 
species-specific fashion (Wiens and Rotenberry 1981b; 
Steele et al. 1984; Taper et al. 1995). Birds are also 
conspicuous, ubiquitous, intensively studied, and 
often appear to be more sensitive to environmental 
changes than other vertebrates (Morrison 1986). Over 
two hundred different bird species reside in the Yel- 
lowstone National Park (McEneaney 1988), and most 
of these species also can be found in Grand Teton Na- 
tional Park. Our research focused primarily on song- 
birds and woodpeckers. 


Field Sampling and Data Analysis 


As a way of explaining our multiple projects, we have 
summarized in Table 44.1 the meadow type, sample size, 
and region examined for each analysis. We began our 
study by defining meadow spectral classification, 


moved to specific subsets of montane meadows (sage- 
brush communities and wetland assessment), and then 
addressed biodiversity data. In our research, we tested 
for associations of specific animal species with our 
meadow types and finally analyzed the community 
similarity between the two regions of the ecosystem. 


Meadow Classification 


Computer analysis of multispectral satellite imagery 
was used to create maps of spectrally distinct montane 
meadows within the study regions. The maps were 
used to identify different meadow types and to guide 
plant and animal field sampling. Indian IRS LISS-II 
imagery from summer 1995 was used to produce 
maps for the 1996 sampling (Kindscher et al. 1998; 
Jakubauskas et al. 1998). Image data from the French 
SPOT (Système Pour l'Observation de la Terre) satel- 
lite for June and September 1996 were used to pro- 
duce maps for the 1997 and 1998 fieldwork. Both the 
IRS and the SPOT systems are multispectral scanners 
that acquire data in green, red, and near-infrared 
bands of the electromagnetic spectrum, with spatial 


TABLE 44.1. 


Analysis type, regions included, meadow types, sample sizes, 
taxa, and years examined for each analysis described in this 
paper. M1-M7 meadows represent spectrally distinct montane 
meadows (M1 = hydric, M6 and M7 = xeric) within Grand Teton 
National Park and the Bridger-Teton National Forest of 
northwestern Wyoming (Tetons), and the Gallatin National 
Forest and Yellowstone National Park (Gallatins). 


Analysis                Region      Meadow type     No. sites   Reps/year 
1996 
Sagebrush communities   Tetons      M6              51          1 
Wetland assessment      Tetons      M1-M3; M5-M7    30          1 
1997-1998 
Bird species            Gallatins   M1-M6           30          3 
                        Tetons      M1-M3; M5-M6    25          3 
Butterfly species       Gallatins   M1-M6           30          4 
                        Tetons      M1-M3; M5-M6    25          4 
Plant species           Gallatins   M1-M6           30          1 
                        Tetons      M1-M3; M5-M6    25          1 
resolutions of 36.5 meters (IRS) and 20 meters 
(SPOT). Each image was georeferenced to a Universal 
Transverse Mercator (UTM) coordinate system. An 
unsupervised classification procedure was used to pro- 
duce a map of spectrally distinct meadow types repre- 
senting a hydric (M1) to xeric (M6) gradient (and M7 
for bare ground and talus). In order to reduce the 
number of variables affecting species distribution, we 
focused on low elevation sites (2,000-2,500 meters) 
with minimal slope (less than 30 degrees). To facilitate 
location of study sites during fieldwork, maps were 
plotted on translucent Mylar sheets, allowing overlay 
onto 1:24,000-scale USGS topographic maps of the 
study region. 


Sagebrush Community Classification 


In order to explore the limits of using multispectral, 
multi-temporal satellite imagery to identify and map 
specific community types, we carried out a second 
mapping procedure focused specifically on the sage- 
brush flats of Grand Teton National Park, Wyoming 
(meadow type M6). Fifty-one field plots were sampled 
in 1996, equally distributed among the four sagebrush 
communities (low sagebrush [Artemisia arbuscula], 
mountain big sagebrush [Artemisia tridentata ssp. 
vaseyana], mixed low sagebrush/big sagebrush, and 
bitterbrush [Purshia tridentata]/big sagebrush) as de- 
fined by Sabinske and Knight (1978). Shrub cover and 
height by species were surveyed as per Knight (1978); 
percent cover by forbs, grasses, shrubs, live stems, and 
mosses/lichens were recorded using the Daubenmire 
technique (Daubenmire 1959) within twenty 0.5x0.5- 
meter quadrats. An agglomerative hierarchical itera- 
tive clustering algorithm was used to assign each of 
the fifty-one sites to one of the four Sabinske and 
Knight (1978) community types (Jakubauskas et al. 
1998). Variables used in the cluster analysis were 
height and percent shrub cover by big sagebrush, low 
sagebrush, and bitterbrush; and percent cover by five 
general understory classes (grasses, forbs, litter, per- 
sistent litter, and rock/soil cover). Multi-temporal 
satellite imagery was combined into a single image file, 
and groups of spectrally similar pixels were identified 
using an unsupervised classification clustering proce- 
dure. Data from the fifty-one field sites sampled in 
1996 were used to identify the probable vegetation 


community represented by each cluster of spectrally 
similar pixels, producing a map of the four sagebrush 
communities described above. Accuracy assessment 
was carried out by comparing our 1996 classifications 
to points classified by Sabinske (1972) within the four 
sagebrush classes. 


Wetland Assessment 


Analysis of wetland classification was conducted 
across all meadow types in the Tetons (M1-M3, 
M5-M7) to determine whether our M1 and M2 
meadows showed up distinctly as "wetlands" based 
upon an index of average wetland value. This index is 
a metric used by researchers studying wetland vegeta- 
tion dynamics (Atkinson et al. 1993; Wilson and 
Mitsch 1996) and is based on the classification of all 
wetland plant species into one of five wetland classes 
(Reed 1988). Plant species are assigned a score from 1 
to 5, depending on their wetland affinity. The wetland 
affinity value for each species is multiplied by the per- 
cent cover of each species at a sample site, yielding a 
numeric score (1.00-5.00), where values closer to 
1.00 have more cover by obligate wetland vegetation 
and those closer to 5.00 have more cover of upland 
plant species. As an indicator of wetlands, any value 
under 3.00 indicates that the area has a wetland iden- 
tity. We surveyed thirty sampling sites in Grand Teton 
National Park, five of each of six meadow types. Veg- 
etation was sampled in a 20x20-meter plot for canopy 
cover (Daubenmire 1959) of all species during 1996 
and an index of wetland value was computed for each 
sampling site (Kindscher et al. 1998). Data were ana- 
lyzed using a nonparametric Kruskal-Wallis test in the 
SPSS/PC+ software package (SPSS 1988). 
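
The index calculation described above can be sketched as a cover-weighted mean of the species affinity scores. This is a minimal sketch, not the code used in the study: dividing by total cover is our assumption about how the weighted scores are reduced to the 1.00-5.00 range, and the species names and cover values are hypothetical.

```python
def wetland_index(cover_by_species, affinity_by_species):
    """Cover-weighted mean wetland affinity score for one sample plot.

    cover_by_species: percent cover for each species in the plot.
    affinity_by_species: wetland class score (1 = obligate wetland,
    5 = upland) for each species.
    """
    total_cover = sum(cover_by_species.values())
    if total_cover <= 0:
        raise ValueError("plot has no recorded cover")
    weighted = sum(cover * affinity_by_species[species]
                   for species, cover in cover_by_species.items())
    # Assumed normalization: divide by total cover so the result
    # stays on the same 1-5 scale as the species scores.
    return weighted / total_cover


# Hypothetical plot with three species.
plot_cover = {"obligate_sedge": 60.0, "facultative_grass": 30.0, "upland_forb": 10.0}
affinity = {"obligate_sedge": 1, "facultative_grass": 3, "upland_forb": 5}

score = wetland_index(plot_cover, affinity)
is_wetland = score < 3.0  # threshold given in the text
```

In this hypothetical plot the obligate sedge dominates, so the score falls well below the 3.00 threshold.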


Biodiversity Assessment 


A biodiversity assessment across all montane meadow 
types (M1-M6) was conducted for birds, butterflies, 
and plants in two regions of the ecosystem: the Tetons 
and the Gallatins. Our intention was to build models 
of species-habitat relations in one region of the ecosys- 
tem (e.g., Tetons) and then test them in the other re- 
gion of the ecosystem (e.g., Gallatins). Field data were 
collected in 1997 and 1998 at five sites in each of the 
meadow classes (twenty-five sites in the Tetons, thirty 
sites in the Gallatins). All sites were a minimum of 
100x100 meters in size, a distance of 500 meters or 
farther from other sites, and within 8 kilometers of a 
road or trail. Vegetation data were collected as de- 
scribed in the wetland section, but in a 100x100-meter 
plot. Abundance data were collected for butterflies 
building upon previously developed methods by De- 
binski and Brussard (1992). Taxonomy followed Scott 
(1986). Butterflies were surveyed between 1000 and 
1630 hours on sunny days by netting and releasing for 
twenty minutes in a randomly selected 50x50-meter 
plot within the 100x100-meter sampling site. Surveys 
were repeated at each sampling site four times in each 
of the two years. Abundance data were collected for 
birds using 50-meter-radius point-count surveys. Sur- 
veys were performed three times at each site per sea- 
son during 0530-1030 hours and between 1 June and 
17 July. Each survey involved two observers for fifteen 
minutes. Landscape data, including percent willow 
cover, percent sagebrush cover, meadow size, distance 
to meadow of the same type, distance to treeline, tree 
density, vegetation biomass, and leaf area index, were 
also collected for examining relationships between 
birds and habitat (Saveraid 1999). 

The species-habitat relationships were analyzed 
using a variety of multivariate statistical techniques. In 
order to assess the level of species detection, we first 
constructed a species accumulation curve for each 
taxon in each landscape by year using PC-ORD 
(McCune and Mefford 1997) and estimated total 
species richness for each landscape using a first-order 
jackknife estimator (Heltshe and Forrester 1983; 
Palmer 1992). An important characteristic of this type 
of community data is that the number of observations 
(sampled sites) is small with respect to the number of 
variables (plant, bird, or butterfly species). A consider- 
able number of species were represented by fewer than 
ten individuals and were dropped from the data sets. 
This approach restricted the analysis to those species 
with abundance measures large enough for rigorous 
statistical analysis. Multiple regression models were 
developed for eleven bird species using both meadow 
type and landscape variables (Saveraid 1999). Classifi- 
cation and regression tree analysis (CART; Steinberg 
and Colla 1997) was used to examine butterfly species 
relationships with meadow types. The latter analysis 
does not allow one to predict species presence based 


upon meadow type; in effect, our data (species distri- 
butions and abundances) constrain us to predict 
meadow type (a discrete category) from species abun- 
dance data. However, the results do give information 
regarding which species show affinities for specific 
meadow types. For example, the abundance of species 
X may differentiate the M5 meadows from the 
M6 meadows, and CART gives an importance score 
to species X and abundance value that allows for that 
split. In all of these analyses, each year, region, and 
taxon were tested separately. 
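
To illustrate how the abundance of a single species can differentiate two meadow types, the sketch below searches for the abundance threshold that most reduces Gini impurity at one binary split. It is a simplified stand-in for the CART software cited above: a full classification tree repeats this search over all species and nodes, and its importance scores draw on more than a single split. The abundance values and the names species_x and species_y are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(abundance, labels):
    """Return (threshold, impurity_decrease) for the best split on one species."""
    n = len(labels)
    parent = gini(labels)
    values = sorted(set(abundance))
    best = (None, 0.0)
    # Candidate thresholds are midpoints between consecutive observed values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [lab for a, lab in zip(abundance, labels) if a <= threshold]
        right = [lab for a, lab in zip(abundance, labels) if a > threshold]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        decrease = parent - child
        if decrease > best[1]:
            best = (threshold, decrease)
    return best


# Hypothetical abundances at five M5 sites and five M6 sites.
meadow = ["M5"] * 5 + ["M6"] * 5
species_x = [0, 1, 0, 2, 1, 9, 12, 8, 15, 10]  # separates the two types
species_y = [3, 4, 2, 5, 3, 4, 3, 5, 2, 4]     # similar in both types

split_x = best_split(species_x, meadow)
split_y = best_split(species_y, meadow)
```

Here species_x yields a clean split at an abundance of 5, whereas species_y barely reduces impurity, which is the kind of contrast an importance score summarizes.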


Similarity between Communities 


Differences between communities in the two regions 
of the ecosystem were examined using a Bray-Curtis 
measure of community distance (Bray and Curtis 
1957). Bray-Curtis measures were calculated within 
each meadow type (excluding M4) for each taxon dur- 
ing 1997 and 1998 using each temporal replicate as a 
subsample and pooling all sites from the Gallatins to 
be compared with all sites from the Tetons. Tests for 
significant differences in community distance were 
conducted using an MDS on Bray-Curtis Distance 
(SAS Institute 1990b). 
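
For readers unfamiliar with the measure, the Bray-Curtis distance between two abundance vectors is the sum of absolute differences divided by the sum of all abundances. The sketch below shows only that calculation, not the significance test; the abundance vectors are hypothetical.

```python
def bray_curtis(x, y):
    """Bray-Curtis distance between two species-abundance vectors.

    Both vectors list abundances for the same species in the same order.
    Returns 0 for identical communities and 1 when no species are shared.
    """
    if len(x) != len(y):
        raise ValueError("vectors must cover the same species")
    total = sum(x) + sum(y)
    if total == 0:
        raise ValueError("both vectors are empty")
    return sum(abs(a - b) for a, b in zip(x, y)) / total


# Hypothetical pooled abundances of four species in one meadow type.
gallatin = [10, 0, 5, 3]
teton = [2, 8, 5, 1]

distance = bray_curtis(gallatin, teton)
```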


Results 


Our results are divided into three categories: (1) classi- 
fication of meadows into wetland or sagebrush com- 
munity types, (2) tests for association of specific 
animal species with our meadow types, and (3) analy- 
sis of similarity between the two regions of the ecosys- 
tem. We should preface these comments by noting that 
we felt relatively confident that our analyses were 
based on a large proportion of the total species pres- 
ent in the area. The number of bird and butterfly 
species observed relative to the estimated total species 
richness (first-order jackknife estimator) in 1997 and 
1998 showed that our sampling efficiency averaged 
just over 80 percent for butterflies and plants and 74 
percent for birds (Table 44.2). The birds in the Gal- 
latin landscape showed the lowest percentage of ob- 
served species relative to the total predicted (69 per- 
cent) but nonetheless represented the majority of the 
species in the landscape. 
TABLE 44.2. 


Species accumulation results for each taxon in each 
landscape by year. Total species richness was estimated for 
each landscape using a first-order jackknife estimator. 
Species accumulation curves were plotted across all meadow 
types using each sampling date as a separate replicate. For 
each year, there were three temporal replicates for birds, four 
temporal replicates for butterflies, and one temporal replicate 
for plants. 


                     No. spp.    First-order           % Species observed relative 
                     observed    jackknife estimate    to number predicted 
Gallatins 
  Bird 1997          33          47.8                  69 
  Bird 1998          26          37.8                  69 
  Butterfly 1997     45          55.9                  81 
  Butterfly 1998     50          61.9                  81 
  Plant 1997         299         STO 3L                82 
  Plant 1998         236         303.3                 78 
Tetons 
  Bird 1997          40          48.9                  82 
  Bird 1998          39          619                   75 
  Butterfly 1997     59          70.9                  83 
  Butterfly 1998     59          74.8                  79 
  Plant 1997         25,         320.8                 80 
  Plant 1998         247         307.5                 80 
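
The first-order jackknife estimate adds to the observed species count a correction based on the number of species found in only one sample. The sketch below uses the standard form of the estimator, S + Q1(n - 1)/n, where S is the number of species observed, Q1 is the number of species occurring in exactly one sample, and n is the number of samples. The species lists are hypothetical, and this is not the PC-ORD implementation.

```python
def jackknife1(samples):
    """First-order jackknife estimate of total species richness.

    samples: list of sets, each holding the species seen in one sample.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    observed = set().union(*samples)
    # Q1: species that occur in exactly one sample.
    uniques = sum(1 for species in observed
                  if sum(species in sample for sample in samples) == 1)
    return len(observed) + uniques * (n - 1) / n


# Hypothetical species lists from four samples.
samples = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "e"}, {"a", "b"}]

estimate = jackknife1(samples)
observed_count = len(set().union(*samples))
percent_observed = 100 * observed_count / estimate
```

The last line mirrors the final column of Table 44.2. For example, 26 bird species observed against an estimate of 37.8 gives about 69 percent.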


Sagebrush Community Classification 


Overall accuracy of our sagebrush community classifi- 
cation was 65 percent (thirty-three of fifty-one sites) 
and highest for the mixed big sagebrush/low sage- 
brush community at 86 percent (Jakubauskas et al. 
1998). Results indicate that the four sagebrush com- 
munities can be best differentiated using visible bands 
of reflected light recorded by a satellite sensor, inde- 
pendent of image date. Near-infrared reflectance was 
of value for distinguishing between sites only in late 
fall (October). Analysis of the accuracy assessment in- 
dicates that the pure big sagebrush and low sagebrush 
communities are consistently misclassified with the 
mixed big/low sagebrush community. Misclassification 
of the bitterbrush and big sagebrush communities as 
forb-dominated areas occurred primarily as a result of 
a fire east of Blacktail Butte in Grand Teton National 


Park in 1994. Both the bitterbrush and the big sage- 
brush communities were burned during this fire, 
which regenerated to forb and grass cover by the date 
of the satellite overpass. Accuracy for low sagebrush 
may be higher than warranted, as areas of sparse 
cover within mixed and big sagebrush communities 
were mapped as low sagebrush. In the eastern part of 
the study area, where this community type does not 
occur, areas mapped as low sagebrush most likely rep- 
resent areas of less-dense big sagebrush rather than 
low sagebrush. 
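
Accuracy figures of this kind are usually read from a confusion matrix that cross-tabulates reference classes against mapped classes. The sketch below shows the arithmetic for overall and per-class accuracy. The three-class matrix is hypothetical and is not the matrix from our assessment; only the 33-of-51 figure comes from the text.

```python
def overall_accuracy(matrix):
    """Share of sites on the diagonal (reference class equals mapped class)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_accuracy(matrix):
    """Correct share of each reference class (rows are reference classes)."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


# Hypothetical confusion matrix: rows = reference class, columns = mapped class.
confusion = [
    [8, 2, 0],
    [1, 6, 3],
    [0, 1, 9],
]

overall = overall_accuracy(confusion)
by_class = per_class_accuracy(confusion)

# The overall figure reported in the text: 33 of 51 sites.
reported_percent = round(100 * 33 / 51)
```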


Wetland Assessment 


We were able to accurately classify 70 percent (seven 
out of ten) of the wetlands we studied into two wet- 
land groups (M1 and M2) and to predict the location 
of 1,258 hectares of M1 wetland meadows and 1,711 
hectares of M2 wetland meadows within Grand Teton 
National Park (Kindscher et al. 1998). One hundred 
and eighty-three plant species were found in the mead- 
ows surveyed, including ten obligate wetland species. 
Eight out of ten obligate wetland plant species had 
their greatest cover in M1 meadows and had signifi- 
cant cover differences among meadow types using the 
nonparametric Kruskal-Wallis test. Percent cover data 
showed that M1 and M2 meadows were dominated 
by obligate wetland vegetation, whereas other mea- 
dow types had less than 0.1 percent obligate wetland 
vegetation. Facultative wetland plant coverage showed 
similar trends. 
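
The Kruskal-Wallis comparison of cover among meadow types rests on ranks rather than raw cover values. The sketch below computes the H statistic with average ranks for ties. It omits the tie correction and the P value that a statistics package would report, and the cover values are hypothetical.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for lists of values."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)
    # Assign each distinct value the mean of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_term = sum(sum(rank_of[v] for v in group) ** 2 / len(group)
                    for group in groups)
    return 12.0 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)


# Hypothetical percent cover of one obligate wetland species by meadow type.
cover_m1 = [40.0, 55.0, 62.0]
cover_m2 = [12.0, 20.0, 25.0]
cover_m5 = [0.0, 0.5, 1.0]

h_stat = kruskal_wallis_h([cover_m1, cover_m2, cover_m5])
```

Larger values of H indicate that the ranked cover values are more strongly separated among meadow types.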


Biodiversity Assessment 


Eleven of the thirty-seven bird species observed were 
used to test predictability of species occurrence in both 
1997 and 1998. We compared the accuracy of predict- 
ing bird occurrence using both spectrally defined 
meadow types and landscape data (Saveraid 1999). 
The selection criterion for the birds used in the analysis 
was a total abundance greater than or equal to four- 
teen in each year of the study. Meadow type, as deter- 
mined from the satellite data, was significantly corre- 
lated with the abundance of six of the eleven bird 
species (R² range of 0.275 to 0.49). The abundance of 
generalist species (American robin [Turdus migrato- 
rius], dark-eyed junco [Junco hyemalis], white- 
crowned sparrow [Zonotrichia leucophrys], Brewer’s 
blackbird [Euphagus cyanocephalus], and chipping 
sparrow [Spizella passerina]) was not strongly corre- 
lated with meadow type. We then tested the use of a 
combination of meadow type and landscape variables 
(e.g., meadow area, percent graminoid cover, percent 
willow cover, percent sagebrush cover, distance to 
treeline; see Saveraid 1999 for details) to increase the 
predictability of the models. Significant landscape 
variables were selected using a stepwise multiple re- 
gression for each bird species. These variables were 
then used in a multiple regression analysis with the 
classification variable meadow type. Ten out of the 
eleven bird species showed a significant correlation 
with one or more variables when both landscape vari- 
ables and meadow type were used in the models (R² 
range of 0.189 to 0.811). Abundances of the species 
commonly associated with hydric meadows (common 
snipe [Gallinago gallinago], common yellowthroat 
[Geothlypis trichas], Lincoln's sparrow [Melospiza lin- 
colnii], Savannah sparrow [Passerculus sandwich- 
ensis], and yellow warbler [Dendroica petechia]) were 
significantly correlated with meadow type and land- 
scape variables such as percent willow cover and per- 
cent woody vegetation. There were fewer species in 
the xeric meadows, but the most commonly observed 
species, the vesper sparrow (Pooecetes gramineus), 
was highly correlated with meadow type and percent 
sagebrush cover. 

Butterfly distributions were much more strongly as- 
sociated with specific meadow types in the Teton land- 
scape relative to the Gallatin landscape. Using species 
abundance data for each meadow type, classification 
tree analysis was used to identify species that were 
most important in distinguishing among meadow 
types. Of the sixty-seven butterfly species found dur- 
ing our surveys in the Tetons, twenty species in 1997 
and twenty-seven species in 1998 were used in the sta- 
tistical analysis. Fourteen species were defined as im- 
portant in distinguishing among meadow types, and 
these species could be used to classify sampling sites 
into one of five different meadow types with 96 
percent accuracy in 1998 and 92 percent accuracy in 
1997. Six species showed high importance scores for 
both years in the Tetons. Each species is listed fol- 
lowed by the meadow type for which it showed a high 
affinity: Coenonympha ochracea (M6), Lycaena het- 
eronea (M6), Coenonympha haydenii (M5), Cercyo- 
nis oetus (M3), Speyeria mormonia (M3), and Boloria 
frigga (M2). Predictability of species was much lower 
(63 percent) in the Gallatins, where patch size is much 
smaller, but many of the same species showed high im- 
portance scores in differentiating the meadows (Bor- 
gognone 1999). 


Similarity between Communities 


Community differences between the Gallatin and 
Teton meadows were significantly larger than expected 
via random variation in the majority of the compar- 
isons examined (Table 44.3). Two of the five meadow 
types examined showed significant differences for birds 
and plants and four of the five meadow types showed 
significant differences for butterflies. We report results 
here from 1998; similar results were observed in 1997. 
Given these differences, we did not pursue testing mod- 
els developed in one region of the ecosystem on the 
species observed in the alternate region. 


Discussion 


Each of these sub-projects has taken a similar ap- 
proach to testing the predictability of communities 
and species patterns in the Greater Yellowstone 
Ecosystem. In each case, remotely sensed data were 
analyzed to create meadow classification maps that 
guided field surveys. Biotic communities were tested 
for their association with these remotely sensed 
meadow types. In general, predicting meadow type 
(and the major plant species associated with that 
meadow type) is somewhat easier than predicting a 
long list of specific species occurrences. This is not sur- 
prising given that meadow type is a more general clas- 
sification and comprises both plant communities and 
abiotic factors that play a role in determining what re- 
flectance patterns are measured by a satellite. 

When we create predictive models of species occur- 
rence, one of the major limitations is that rare species 
are often such a small component of the data set that 
they cannot be used to build predictive models (Hep- 
install et al., Chapter 53; Karl et al., Chapter 51). An 
inherent statistical problem exists when the number of 
variables (in this case species) describing a site is more 
numerous than the number of sampling sites. In both 
TABLE 44.3. 


Bray-Curtis distances for birds, butterflies, and plants in Gallatin versus Teton 
meadows, 1998. Five spatial replicates per meadow type were compared for 
differences in species composition. Distance value is noted above and the P value is 
noted below. Significant values (at P < .05) are marked with an asterisk and indicate 
that between-group distance is greater than within-group distance. 


Taxon                     M1       M2       M3       M5       M6 
Birds         Distance    0.636    0.72     0.782    0.879    0.657 
              P           0.01*    0.09     0.04*    0.46     (51i 
Butterflies   Distance    0.715    0.569    0.489    0.488    0.463 
              P           0.01*    0.01*    0.01*    0.14     0.01* 
Plants        Distance    0.734    0.647    0.682    0.602    0.527 
              P           0.31     0.07     0.18     0.01*    0.01* 


the bird and the butterfly community analyses, we 
ended up analyzing only a small fraction of the total 
number of species for their associations with specific 
meadow types. The species we used in our models, 
however, gave us relatively strong results. The species 
that were used in these models may not include all of 
the rare species, but they are important indicators of 
each of the meadow types examined in this research. 
A future step in this research arena might be to test 
how often rare species are associated with more com- 
mon "indicator" species. Interestingly, our attempt to 
build models in one part of the ecosystem and test 
them in another part of the ecosystem was not suc- 
cessful. Even though the two regions are less than 200 
kilometers apart, the Teton and Gallatin meadows dif- 
fered significantly with respect to the majority of the 
bird, butterfly, and plant communities. 

In building models of community composition, it is 
important to examine the effects of species ranges, 
niche widths, seasonality, sampling intensity, land- 
scape history, patch size, and landscape context. These 
variables may explain why we find a species in what 
could be considered unsuitable habitat, or alterna- 
tively why we might not find a species in what would 
be considered suitable habitat. Only one species 
(Coenonympha haydenii, the Hayden's ringlet butter- 
fly) of the three taxa we surveyed would be considered 
endemic to the region. There were no range edges that 
should have significantly affected species distribution 
patterns within the ecosystem. Predictability may be 
affected by niche width (Hepinstall et al., Chapter 53), 


and thus it may not be surprising that generalist birds 
showed different patterns than specialists in our study. 
We should note that many of our species predictions 
are seasonally limited. Because some of the bird 
species are migratory and many of the plant and but- 
terfly species are not visible during the winter, these 
predictions are intended for the summer season only. 

Our data concur with Karl et al. (Chapter 51) that 
intense sampling is necessary to develop predictive 
models of species-habitat relationships. Historical 
factors (e.g., fire in the sagebrush communities) may 
also affect patterns of species distribution. Two vari- 
ables affecting predictability that we are just begin- 
ning to examine include patch size and landscape 
context. If habitat is present but present in small 
patches relative to the home range of an organism, 
the predictability of species-habitat relationships may 
be less reliable. Similarly, landscape context may 
complicate the predictability of species occurrence 
models, and community spillover (Holt 1997) from 
one patch to another may confound species-habitat 
relationships. We suspect that patch-size effects and 
spillover may have contributed to the lower pre- 
dictability of butterfly species distributions in the Gal- 
latin landscape. Several of the meadow types in the 
Gallatins are much smaller on average than their 
counterparts in the Tetons, even by an order of magnitude 
(Debinski unpublished data). 

The methods described here have management appli- 
cations from a short-term as well as a long-term 
planning perspective. In the short term, organizations such as 
The Nature Conservancy are using these types of tech- 
niques in their ecoregional planning (The Nature Con- 
servancy 2000). Satellite data may increase our effi- 
ciency in locating areas of high biodiversity or with 
high probabilities of supporting rare species. From a 
longer-term planning perspective, one of the more inter- 
esting applications of these models will be that of mon- 
itoring communities for effects of global climate 
change. Detection and characterization of changes in 
vegetation extent and condition at multiple spatial and 
temporal scales provides important information about 
the variability within a habitat patch. Recent advances 
in remote sensing technology and theory have expanded 
opportunities to characterize the seasonal and interan- 
nual dynamics of vegetation communities. Analysis of 
changing spectral patterns can provide precursor meas- 
urements of terrestrial ecosystem dynamics (Waring et 
al. 1986; Ustin et al. 1993; Lancaster et al. 1996). The 
temporal domain of multispectral data frequently pro- 
vides more information about land cover and condition 
than do the spatial, spectral, or radiometric domains 
(Kremer and Running 1993; Eastman and Fulk 1993; 
Samson 1993; Reed et al. 1994; Jensen et al. 1993). 
Quantification of this inherent landscape variability at a 
regional scale using remotely sensed data, in concert 
with predictive models of community type and species 
distributions, will provide the foundation for modeling 
the effects of environmental change and consequent ef- 
fects upon plant and animal biodiversity. 

Our long-term goal is to develop a suite of re- 
motely sensed and field-based indices of community 
identity and species composition that will serve as in- 
dicators of environmental change. Montane commu- 
nities are likely to be some of the most susceptible to 
climate change (Peters and Lovejoy 1992). Changes 
in hydrology and plant species composition, them- 
selves driven by seasonal and interannual changes in 
climate, determine cover and vigor of individual plant 
species and their availability and use by animals. Nat- 
urally occurring plant communities occupy specific 
geographic sites based on narrowly defined adapta- 
tions to gradients of temperature and moisture. 
Short-term changes in environmental conditions are 
manifested as changes in vegetation condition, while 
long-term, directional shifts in temperature and moisture
regimes drive changes in species composition and
diversity (e.g., Harte and Shaw 1995). Although the
experts are still debating whether montane ecosys- 
tems will become more wet or more dry, there is vir- 
tually no disagreement that the earth's surface will 
become warmer (Schneider 1993). Because the mead- 
ows we have surveyed are arranged along a moisture 
gradient, we expect to be able to detect changes in the 
landscape and associated communities at either end 
of the spectrum. 


Conclusions 


The results from our study indicate that satellite im- 
agery is applicable for mapping wetland and sage- 
brush communities and for predicting dominant veg- 
etation in each of these communities with a high 
level of accuracy. We were also able to show signifi- 
cant correlation between specific meadow types and 
the distribution patterns of a subset of the bird and 
butterfly species in montane meadow types. Butter- 
flies showed stronger relationships with meadow 
type than birds did, especially in the Teton land- 
scape. Because birds respond to habitat structure, 
adding landscape variables such as percent cover of 
the dominant woody vegetation increased the pre- 
dictability of our models. 
Models that reliably quantify relationships between
environmental variables, species occurrences,
and population dynamics can enhance our capacity
for land-use planning substantially. Attempting
to develop such models is critical, for managers simul- 
taneously face daunting obligations and stringent re- 
source constraints (Stohlgren et al. 1995; Oliver and 
Beattie 1996; Longino and Colwell 1997; Niemi et al. 
1997; Simberloff 1998; Maurer, Chapter 9). Analytic 
and predictive models of species richness, occurrence, 
and viability may assist practitioners in meeting nu- 
merous objectives, including conservation of relatively 
rich or unusual species assemblages and protection or 
eradication of individual species (or other taxonomic 
levels). If a subset of the significant predictive vari- 
ables can be modified by human activities, then the 
models also may be used to forecast the biological ef- 
fects of alternative management strategies. 

Species of concern often include native plants and 
animals that are threatened by anthropogenic or natu- 
ral factors. Introduced or invasive species represent a 
second focal group. Yet other species become the tar- 
get of planning efforts because their measurement is 
believed to provide a scientifically reliable and cost- 
effective surrogate measure of a distinct ecological pa- 
rameter that is difficult to assess directly. Indicator 
species, for example, exhibit distributions, abundances,
or population dynamics that can serve as substitute
measures of the status of other species or environmental
attributes (Noss 1990b). Potential umbrella
species, taxa whose conservation confers a protective 
umbrella to numerous cooccurring species, and key- 
stone species, taxa with a disproportionate impact on 
the dynamics of their ecosystems, also have attracted 
considerable attention from biologists and practition- 
ers (Simberloff 1998). Because few organisms have 
proved to be dependable and affordable gauges of 
variables that characterize communities or ecosystems 
(Scott 1998), we urge ecologists to test the efficacy of 
potential surrogate species rigorously before employ- 
ing them for on-the-ground management. 

Because the status of and threats to a given species 
may depend upon the spatial extent at which those 
phenomena are considered, models ideally should be 
able to address factors that influence species distribu- 
tions and viability at multiple spatial extents. The bal- 
ance between benign neglect and substantive human 
intervention that best will protect a certain assemblage 
or species likewise may fluctuate in space and time 
(New et al. 1995; Thomas and Hanski 1997). 

We recently introduced a modeling approach that 
can be utilized at different grains and extents to exam- 
ine species richness in relation to environmental vari- 
ables and to recognize independent variables that in- 
fluence the occurrence and population dynamics of 
selected species (Fleishman et al. 2000). The frame- 
work initially was developed to identify species that 
might be used as indicators of high species richness. In
this chapter, we apply the approach in the context of 
regional land-use planning and management. As pre- 
sented here, our model has three steps. First, we use 
multiple linear regression to predict species richness as 
a function of easily quantified environmental variables 
(e.g., derived from geographic information systems 
[GIS]). The validity of the species richness model is 
evaluated by calculating the percentage of variance in 
species richness that it explains and the probability 
that it correctly classifies locations with respect to cho- 
sen levels of species richness. These tests are per- 
formed not only upon the data used to build the 
model but also upon independent sets of data. Second, 
we employ logistic regression to identify easily quanti- 
fied environmental variables that help explain occu- 
pancy patterns of individual species over relatively 
large areas. Third, we use data on species occurrences 
and local resources to develop a spatially realistic pop- 
ulation model that generates testable probabilities of 
persistence (see Hanski and Simberloff 1997). Our 
modeling approach is a general procedure that can be 
used for diverse taxonomic groups and ecoregions. 
Here we present an overview of the approach and il- 
lustrate its application with case studies of butterflies 
in the Great Basin. 


Modeling Framework 


Land-use planning benefits from a general understand- 
ing of which locations have relatively high or low con- 
centrations of native species and where individual 
species are likely to occur. Clearly, comprehensive field 
inventories address these issues directly, and empirical 
data are essential for developing effective models and 
management strategies. Nonetheless, tried-and-proven 
predictive models have tremendous value. For exam- 
ple, models sometimes can be used to assign tentative 
species richness or occurrence values to planning units 
based on efficiently derived biophysical variables 
(Angermeier and Winston 1999; Mac Nally et al. 
2002; Fleishman et al. 2001). These projections may 
help prioritize groundtruthing and more-detailed 
field studies. Knowledge of circumstances in which
predictive ability tends to be low can be helpful too,
because empirical data collection may be most critical
in those
situations. Moreover, even when ample inventory data
exist, predictive models may help practitioners weigh 
the potential benefits and risks of alternative manage- 
ment approaches. 

Environmental variables that may affect species dis- 
tributions across a landscape include topography, cli- 
mate, frequency and magnitude of disturbance events, 
and patterns of human land use. Over large areas, 
multiple populations of some species, especially those 
regulated largely by density-independent factors, tend 
to fluctuate in synchrony (Pollard 1991). However, 
long-term trends in occupancy or abundance at a par- 
ticular location sometimes deviate from regional 
trends. Models may help elucidate the extent to which 
divergent population dynamics are functions of local 
environmental characteristics or human influences. In 
a similar vein, habitat-based models may clarify 
whether correlates of occupancy or viability differ 
across a species’ range (Thomas et al. 1998; Maurer, 
Chapter 9). 


Predicting Species Richness 


The first step of our modeling framework uses multi- 
ple linear regression to predict species richness across 
a landscape as a function of easily derived environ- 
mental variables. We assume that relatively complete 
lists of resident species have been compiled, using 
standard methods (e.g., Pollard and Yates 1993; 
Heyer et al. 1994; Wilson et al. 1996), for a represen- 
tative sample of locations to be managed in the future 
(i.e., spanning the range of major environmental gra- 
dients and/or vegetational communities). Is this as- 
sumption realistic? We acknowledge that inventory 
data for many landscapes are sparse, but we argue 
that these data are prerequisite for developing produc- 
tive models. 

Predictive variables entered into the species richness
model should be tractable to measure, resistant to observer
bias, and reasonably expected to influence
species distributions in the focal assemblage.
Many relevant variables, such as elevation or aspect, 
can be derived from existing electronic sources of data 
such as digital elevation models (DEMs) and digital line 
graphs (DLGs). Various DEM and DLG coverages of 
the United States are currently available for no charge 
from the U.S. Geological Survey's EROS Data Center,
http://edc.usgs.gov/doc/edchome/ndcdb/ndcdb.html. 
Other variables, including climate parameters like tem- 
perature or precipitation, are relatively simple to quan- 
tify at numerous points in space and time via field meas- 
urements, remote data loggers, or realistic models (e.g., 
the PRISM family of orographic climate models, Daly et 
al. 1994, 1997). If possible, a subset of the independent 
variables should be responsive to management. 

To qualify as valid and practical, a predictive model 
of species richness should explain a significant per- 
centage of variation in the number of species. The 
model also should correctly classify locations with re- 
spect to species richness. In other words, the model 
should accurately identify locations with a species 
richness greater than a certain value and should not 
erroneously predict that other locations meet that cri- 
terion. Because using a model that is not demonstrably 
successful offers no advantages for land-use planning, 
its validity should be assessed not only with the data 
used to build the model, but also with independent 
sets of data (Fielding, Chapter 21). If managers are 
most interested in predicting how species richness in a 
single system will respond to human activities or eco- 
logical change, then existing data may be divided into 
two separate sets for model building and model vali- 
dation. Alternatively, data from later time steps may 
be used for validation. If predicting species richness 
patterns in ecologically similar but poorly inventoried 
systems is a higher priority, then data from new loca- 
tions can be used to test the model's accuracy. 
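
The validation logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in any particular study: the data are synthetic, the two predictors are made up, and the function names (`fit_ols`, `classification_rates`) and the richness threshold of 27 are hypothetical choices for the example. The sketch fits a multiple linear regression to one set of locations, then reports the variance explained and the classification rates for a separate set of locations.

```python
import random

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(A[row][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(beta, X):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], x)) for x in X]

def r_squared(y, yhat):
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def classification_rates(y, yhat, threshold):
    """Share of truly rich sites predicted rich, and of poorer sites wrongly predicted rich."""
    rich = [p >= threshold for o, p in zip(y, yhat) if o >= threshold]
    poor = [p >= threshold for o, p in zip(y, yhat) if o < threshold]
    return sum(rich) / max(len(rich), 1), sum(poor) / max(len(poor), 1)

# Synthetic data only: richness rises with two made-up predictors plus noise.
random.seed(1)
X = [(random.uniform(0, 10), random.uniform(0, 5)) for _ in range(90)]
y = [5 + 3 * a + 2 * b + random.gauss(0, 2) for a, b in X]

X_build, y_build = X[:60], y[:60]   # locations used to build the model
X_test, y_test = X[60:], y[60:]     # independent locations used for validation

beta = fit_ols(X_build, y_build)
r2_test = r_squared(y_test, predict(beta, X_test))
hit_rate, false_rate = classification_rates(y_test, predict(beta, X_test), 27)
print(round(r2_test, 2), round(hit_rate, 2), round(false_rate, 2))
```

Replacing the hold-out slice with data from a later time step, or from a different planning unit, gives the other two validation designs mentioned in the text without changing the code.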


Predicting Species Occurrence 


Species richness is one consideration in delineating 
land uses. In many cases, the distributional patterns of 
certain species also help managers evaluate locations 
for activities ranging from resource extraction to 
treatment with herbicides or prescribed fire (Van 
Horne, Chapter 4). Predictive models of species occur- 
rence, like predictive models of species richness, not 
only are valuable tools in the absence of complete inventory
data, but also may clarify whether environmental
changes will affect the probability that a target
species will inhabit a specified location.

Therefore, the second step of our modeling approach
tests whether easily quantified environmental
variables explain significant variation in the occur- 
rence patterns of individual species. In this phase, we 
use logistic regression to analyze simultaneously corre- 
lations between each of a suite of predictive variables 
and the distribution (presence/absence) of a species of 
interest. Models that are based upon correlations do 
not necessarily identify causes of ecological patterns. 
Nonetheless, they are useful for planning purposes because
they link landscape variables with species distributions
and do not require overwhelming quantities
of field data. Furthermore, it often is possible to draw 
strong biological inferences about why certain vari- 
ables have significant analytic or predictive ability. 
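
As a rough illustration of this second step, the following Python sketch fits a logistic regression of presence/absence on two environmental predictors. Everything in it is an assumption made for the example: the data are synthetic, the predictors are unnamed and assumed to be standardized, and the fitting routine (`fit_logistic`, plain gradient ascent) stands in for whatever statistical package an analyst would normally use.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, rate=0.1, epochs=2000):
    """Fit intercept and slopes by gradient ascent on the mean log-likelihood."""
    k = len(X[0])
    beta = [0.0] * (k + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, yi in zip(X, y):
            err = yi - sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        beta = [b + rate * g / n for b, g in zip(beta, grad)]
    return beta

def occupancy_probability(beta, x):
    """Predicted probability that the species is present at a location."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))

# Synthetic presence/absence data: occupancy depends on the first predictor only.
random.seed(2)
X = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(300)]
y = [1 if random.random() < sigmoid(-0.5 + 2.0 * a) else 0 for a, _ in X]

beta = fit_logistic(X, y)
p_high = occupancy_probability(beta, (1.5, 0.0))
p_low = occupancy_probability(beta, (-1.5, 0.0))
print([round(b, 2) for b in beta], round(p_high, 2), round(p_low, 2))
```

A fitted slope near zero for the second predictor is the kind of result that would lead an analyst to drop that variable, although in practice significance would be judged with standard errors or likelihood-ratio tests rather than by eye.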


Predicting Persistence 


Especially in a dynamic system, managers may want 
to know not only whether a species is likely to be 
present in a given location, but also what ecological 
conditions might promote its persistence. Identifying 
variables that are associated with turnover (coloniza- 
tion and extinction) is especially helpful if manage- 
ment options include restoration of habitat and/or 
reintroduction. It may be possible to test whether 
turnover events are correlated with changes in envi- 
ronmental parameters that are affected by human ac- 
tivities and whether management can minimize varia- 
tion in some of those factors. Detailed population 
modeling is usually data-intensive and therefore restricted
to species of considerable management concern.
Fortunately, the logistical investments necessary to
obtain spatially explicit data on dependent and inde- 
pendent variables (often key resources such as nesting 
sites or food base) can yield high returns. If validation 
tests demonstrate that models parameterized with em- 
pirical data accurately predict turnover events, then it 
also may be possible to generate realistic forecasts 
using projections of environmental change. 
Habitat-based occupancy and viability models are 
applicable to species with diverse population struc- 
tures. Knowledge of the population structure of 
the target species will help guide selection of predictive 
variables. In species with open populations, for 
example, individuals usually are relatively mobile. 
Therefore, assessment of relatively coarse-grained 
environmental attributes is likely most relevant (New
et al. 1995). In species that have closed populations, 
by contrast, local habitat features or resources (e.g., 
vegetation composition and structure, water chem- 
istry) may have a more substantial bearing on demo- 
graphic parameters such as birth and death rates. 

Some other species persist as metapopulations, sets 
of local populations that to some degree are linked by 
dispersal (Hanski and Gilpin 1997; Hanski 1999). Al- 
though each patch of habitat in a metapopulation can 
support a local breeding population, no single popula- 
tion is adequate to ensure the long-term viability of 
the entire metapopulation (Hanski et al. 1995). Mea- 
surement of a suite of environmental variables that 
collectively characterize a range of spatial grains may 
be necessary to accurately model habitat requirements 
for metapopulations. Local habitat features tend to af- 
fect demographic parameters within each patch; 
larger-area attributes such as isolation or climate often 
affect regional demographic trends and population dy- 
namics (Hanski and Gilpin 1997). Conservation of a 
metapopulation demands that multiple patches of 
habitat be maintained, but assessing the importance of 
each patch is complicated by the fact that not all suit- 
able patches are occupied at any given time. Thus, 
metapopulation models serve at least two purposes. 
First, they can distinguish between locations that are 
suitable but unoccupied and locations that probably 
are not suitable for the focal species. Second, they can 
identify environmental correlates of extinction and 
colonization within networks of suitable habitat 
patches. 

There appears to be a trend to classify any spatially 
structured population as a metapopulation. Although 
logistic regression models of population dynamics can 
be applied to species with a broad range of population 
structures, we caution against automatically assuming 
that any “patchy” population functions as a metapop- 
ulation (Hanski and Simberloff 1997; Brommer and 
Fred 1999). If distinct populations are not interde- 
pendent, then it probably is more appropriate to man- 
age each one separately rather than as members of a 
regional system in which local extinction, coloniza- 
tion, and dispersal are fundamental considerations. 

Which spatially realistic model is likely to be most
effective depends in part upon the population dynamics
of the target species. For example, incidence function
models are useful for management purposes because
they can be parameterized with data on species
occupancy from a single time step, and they yield lo- 
cation-specific extinction and colonization probabili- 
ties (Hanski et al. 1996b). However, incidence func- 
tion models assume that species occurrence is 
equilibrial at the regional level (Hanski 1994a,b; Hanski
et al. 1996b; Sjögren-Gulve and Hanski 2000). Repeated
field inventories are essential for evaluating
whether this assumption is met. If the proportion
of locations occupied by the target species changes sig- 
nificantly over time, it is more appropriate to model 
extinction and colonization patterns with logistic regression
(Sjögren-Gulve and Hanski 2000).
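
To make the idea of location-specific extinction and colonization probabilities concrete, the sketch below computes them for a toy patch network using the functional forms commonly given for the incidence function model (connectivity as an area-weighted, distance-discounted sum over occupied patches; colonization rising with connectivity; extinction falling with patch area; incidence including a rescue effect). The patch coordinates, areas, and parameter values (`alpha`, `y`, `e`, `x`) are hypothetical and are not estimates for any species; in practice they would be fitted to occupancy data.

```python
import math

def connectivity(i, patches, alpha):
    """S_i: sum over occupied patches j != i of exp(-alpha * d_ij) * A_j."""
    xi, yi = patches[i]["xy"]
    total = 0.0
    for j, p in enumerate(patches):
        if j != i and p["occupied"]:
            d = math.hypot(p["xy"][0] - xi, p["xy"][1] - yi)
            total += math.exp(-alpha * d) * p["area"]
    return total

def incidence(i, patches, alpha, y, e, x):
    """Return (colonization, extinction, long-term incidence) for patch i."""
    s = connectivity(i, patches, alpha)
    colonization = s ** 2 / (s ** 2 + y ** 2)
    extinction = min(1.0, e / patches[i]["area"] ** x)
    # Rescue effect: extinction is discounted when colonization is likely.
    denom = colonization + extinction - colonization * extinction
    return colonization, extinction, (colonization / denom if denom > 0 else 0.0)

# Hypothetical network: two occupied patches and two empty patches of equal area.
patches = [
    {"xy": (0.0, 0.0), "area": 5.0, "occupied": True},
    {"xy": (1.0, 0.0), "area": 0.5, "occupied": False},   # close to occupied patches
    {"xy": (6.0, 0.0), "area": 0.5, "occupied": False},   # isolated
    {"xy": (1.0, 1.0), "area": 3.0, "occupied": True},
]
params = dict(alpha=1.0, y=1.0, e=0.2, x=1.0)  # illustrative values only

near = incidence(1, patches, **params)
far = incidence(2, patches, **params)
print([round(v, 3) for v in near], [round(v, 3) for v in far])
```

The two empty patches have the same area and therefore the same extinction probability, so the difference in their predicted incidence comes entirely from isolation. This is the sense in which such a model can separate suitable but unoccupied locations from locations that are unlikely to be colonized.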


Case Study 


The Great Basin of western North America includes 
nearly 430,000 square kilometers of internal drainage 
bounded by the east slope of the Sierra Nevada
and southern Cascades to the west, the west slope of 
the Wasatch Range to the east, the Columbia River to 
the north, and the Colorado River to the south 
(Grayson 1993). More than 75 percent of the ecore- 
gion is federally owned and managed under multiple- 
use mandates. Managers face considerable resource 
constraints; even the most fundamental data on 
species distributions frequently are not available. 
Topographically, the Great Basin is dominated by 
more than two hundred mountain ranges. After the 
Pleistocene, these ranges were isolated from the sur- 
rounding valleys as the regional climate became 
warmer and drier (Brown 1978; Grayson 1993). Indi- 
vidual mountain ranges, and the canyons that deeply 
incise many of them, essentially function as islands of 
habitat for numerous taxa that either are restricted to 
montane vegetation types or have relatively low mo- 
bility, including butterflies (McDonald and Brown 
1992; Murphy and Weiss 1992). Federal oversight 
agencies generally develop management plans on a 
range-by-range basis. Within mountain ranges, land 
uses commonly are delineated at the level of individual 
or several adjacent canyons. 
We chose to focus on butterflies for several reasons. 
They are well understood biologically, are fairly 
easy to study and monitor, and have relatively short 
generation times, and thus may exhibit rapid responses
to management (e.g., Ehrlich and Davidson
1960; Pollard 1977; Scott 1986; New 1991; Kremen 
1992; Pollard and Yates 1993; Harding et al. 1995; 
Shapiro 1996). In addition, the presence of some but- 
terflies has been proposed to convey information on 
other taxonomic groups or ecosystem attributes of 
concern (e.g., Pyle et al. 1981; Erhardt 1985; Ehrlich 
and Murphy 1987; Eyre and Rushton 1989; Morris 
et al. 1989; Viejo et al. 1989; Hafernik 1992; Pren- 
dergast et al. 1993; Kremen 1994; Nelson and Ander- 
sen 1994; Holl 1995; Thomas 1995; Blair and Launer 
1997). 

From 1994 through 1996, we conducted compre- 
hensive inventories of butterflies in nineteen canyons 
in the Toiyabe Range, a large mountain range in the 
central Great Basin. We used walking transects, an es- 
tablished technique that reliably detects species pres- 
ence (Kremen 1992; Pollard and Yates 1993; Harding 
et al. 1995). It is unlikely that we failed to detect 
species that actually were present in a given location. 
Field personnel were familiar with the regional butter- 
fly fauna, and we restricted our inventories to times 
when the weather was most favorable for flight 
(Shapiro 1975; Pollard 1977; Swengel 1990; Thomas 
and Mallorie 1985; Pollard and Yates 1993). It is rea- 
sonable to interpret that a given butterfly species is ab- 
sent if the area has been searched with these methods 
during the appropriate season and weather conditions 
(Pullin 1995; Reed 1996). During the study period, we 
recorded sixty-eight resident species (those that com- 
plete their entire life cycle in the Toiyabe Range), 98 
percent of the number expected under a Michaelis- 
Menten model (Clench 1979; Raguso and Llorente- 
Bousquets 1990; Soberón and Llorente 1993). 
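
The completeness figure above rests on fitting an asymptotic accumulation curve. A minimal sketch of that calculation follows, assuming the Michaelis-Menten form S(n) = S_max * n / (B + n), where n is sampling effort and S(n) is the cumulative number of species. The effort values and species counts are synthetic (generated from S_max = 70 and B = 10), and the double-reciprocal fit used here is only one of several ways to estimate the asymptote.

```python
def fit_michaelis_menten(effort, species):
    """Estimate (s_max, b) in S = s_max * n / (b + n) from the double-reciprocal line.

    1/S = 1/s_max + (b/s_max) * (1/n), so an ordinary regression of 1/S on 1/n
    gives intercept 1/s_max and slope b/s_max.
    """
    u = [1.0 / n for n in effort]
    v = [1.0 / s for s in species]
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    slope = (sum((a - mu) * (c - mv) for a, c in zip(u, v))
             / sum((a - mu) ** 2 for a in u))
    intercept = mv - slope * mu
    s_max = 1.0 / intercept
    return s_max, slope * s_max

# Synthetic accumulation data generated from s_max = 70, b = 10 (no noise).
effort = [5, 10, 20, 40, 80, 160]
species = [70.0 * n / (10.0 + n) for n in effort]

s_max, b = fit_michaelis_menten(effort, species)
completeness = species[-1] / s_max   # share of the expected total already recorded
print(round(s_max, 2), round(b, 2), round(completeness, 3))
```

With noisy field data the double-reciprocal fit gives heavy weight to the earliest, smallest samples, so a nonlinear least-squares fit of the untransformed curve may be preferable.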


Species Richness of Butterflies 


Identifying and predicting where native species rich- 
ness is relatively high or low holds considerable value 
for development of management strategies. Using our 
data on butterflies in the Toiyabe Range as a case 
study, we used multiple linear regression to build a 
predictive model of species richness as a function of 
GIS-derived environmental variables. Managers could 
potentially employ this model in planning efforts not
only for the Toiyabe Range, but also for ecologically
similar mountain ranges in the Great Basin. 

Our regression analysis was based on inventory data 
and environmental variables for sixty-eight canyon 
segments, each extending for approximately one hun- 
dred vertical meters, in fifteen Toiyabe Range canyons 
(see Fleishman et al. 1998). Data from an additional 
thirty-four canyon segments were used to validate the 
regression model. Species richness of these 102 canyon 
segments ranged from seven to fifty-three. We included 
forty-eight environmental variables in our analysis 
(Table 45.1). All are either static or amenable to re- 
peated measurement and were relatively easy to quan- 
tify. To obtain values, we first recorded the endpoints 
of each canyon segment with differentially corrected 
global positioning systems (GPS). We overlaid the GPS 
points on a thirty-meter DEM and buffered each 
canyon segment to a width of 100 meters. Next, we 
used ArcView and ArcInfo (GIS software packages, 
Environmental Systems Research Institute, Redlands, 
Calif., http://www.esri.com) and a small suite of exist- 
ing algorithms (e.g., Dubayah and Rich 1996) to derive 
the environmental variables. Values for this set of inde- 
pendent variables easily can be obtained for locations 
that have not yet been inventoried. 

Our analysis of butterfly species richness in the 
Toiyabe Range was highly statistically significant 
(Fleishman et al. 2000). The full model, S = 7129.78 +
5.476 Llength - 0.1246 NORMEAN + 0.1382 NORMAX
- 0.0032 EQINMAX + 7.084 Leqinstd +
0.00892 SSINMAX - 4.608 SSHRMAX - 0.285
EX15MEAN, explained a considerable percentage (77
percent) of the variation in species richness (see Table 
45.1 for a complete explanation of the independent 
variables). Species richness of a canyon segment 
tended to increase with increasing area and with in- 
creasing heterogeneity in aspect and solar insolation. 
These correlations are not surprising given the life his- 
tory of many butterfly species. For example, area may 
be correlated with diversity of larval host plants and
adult nectar sources and with topographic heterogeneity.
Variation in aspect and solar insolation may function
as surrogate measures of overall topographic diversity.
Not only is topographic heterogeneity often
associated with vegetational diversity, but also the
adults of many butterfly species locate mates on
hilltops or other prominent topographic features
(Scott 1975).


TABLE 45.1.

Environmental variables for each canyon segment included in step one of the case study, how each was derived, and the data and software needed for their derivation.ᵃ

| Variable | Definition | Data and software needsᵇ |
|---|---|---|
| EASTMEAN | Mean “eastness” on a scale from -100 (west-facing) to 100 (east-facing). Derivation for each cell: 100*sine(aspect in degrees) | 1, 2, 3 |
| EASTMIN | Minimum eastness | 1, 2, 3 |
| EASTMAX | Maximum eastness | 1, 2, 3 |
| EASTSTD | Standard deviation of eastness | 1, 2, 3 |
| NORMEAN | Mean “northness” on a scale from -100 (south-facing) to 100 (north-facing). Derivation for each cell: 100*cosine(aspect in degrees) | 1, 2, 3 |
| NORMIN | Minimum northness | 1, 2, 3 |
| NORMAX | Maximum northness | 1, 2, 3 |
| NORSTD | Standard deviation of northness | 1, 2, 3 |
| Lelevmea | Mean elevation in meters (ln-transformed). Derived directly from DEM cell values | 1, 2, 3 |
| Lelevmin | Minimum elevation (ln-transformed) | 1, 2, 3 |
| Lelevmax | Maximum elevation (ln-transformed) | 1, 2, 3 |
| Lelevstd | Standard deviation of elevation (ln-transformed) | 1, 2, 3 |
| Lelevmid | Midpoint elevation (ln-transformed) | 1, 2, 3 |
| Lgrad | Gradient (ln-transformed). Derivation: 57.3*arctan[(maximum elevation - minimum elevation)/length] | 1, 2, 3 |
| LNA | Surface area (ln-transformed). Derivation for each cell: cell size/cosine(slope) | 1, 2, 3 |
| SLOPMEAN | Mean slope in degrees. Derived directly from the DEM with the ArcView slope command | 1, 2, 3 |
| Lslopmin | Minimum slope (ln-transformed) | 1, 2, 3 |
| SLOPMAX | Maximum slope | 1, 2, 3 |
| SLOPSTD | Standard deviation of slope | 1, 2, 3 |
| Llength | Length in meters (ln-transformed). Derived by measuring the length of the DLG road or trail vector bisected by the GPS points | GPS points (see text), DLG transportation coverage |
| PRECIP | Mean annual precipitation in mm for the 4x4-kilometer cell in which the canyon segment falls or weighted mean of the cells in which the canyon segment falls. Derived directly from PRISM cell values (Daly et al. 1994) | 2, 3 |
| EQINMEAN | Mean solar insolation in kilojoules at the vernal equinox | 1, 4, 5 |
| EQINMIN | Minimum solar insolation at the vernal equinox | 1, 4, 5 |
| EQINMAX | Maximum solar insolation at the vernal equinox | 1, 4, 5 |
| Leqinstd | Standard deviation of solar insolation at the vernal equinox (ln-transformed) | 1, 4, 5 |
| SSINMEAN | Mean solar insolation in kilojoules at the summer solstice | 1, 4, 5 |
| SSINMIN | Minimum solar insolation at the summer solstice | 1, 4, 5 |
| SSINMAX | Maximum solar insolation at the summer solstice | 1, 4, 5 |
| Lssinstd | Standard deviation of solar insolation at the summer solstice (ln-transformed) | 1, 4, 5 |
| EQHRMEAN | Mean duration of direct sunlight in hr at the vernal equinox | 1, 4, 5 |
| EQHRMIN | Minimum duration of direct sunlight at the vernal equinox | 1, 4, 5 |
| EQHRMAX | Maximum duration of direct sunlight at the vernal equinox | 1, 4, 5 |
| EQHRSTD | Standard deviation of duration of direct sunlight at the vernal equinox | 1, 4, 5 |
| SSHRMEAN | Mean duration of direct sunlight in hr at the summer solstice | 1, 4, 5 |
| SSHRMIN | Minimum duration of direct sunlight at the summer solstice | 1, 4, 5 |
| SSHRMAX | Maximum duration of direct sunlight at the summer solstice | 1, 4, 5 |
| SSHRSTD | Standard deviation of duration of direct sunlight at the summer solstice | 1, 4, 5 |
| EX3MEAN | Mean topographic exposure within a 300-meter radius. Compares the elevation of the canyon segment with the mean elevation of a specified neighborhood around that segment. If the segment is in a valley, value < mean; if on a ridge, value > mean; if on an open slope, value = mean, slope ≠ 0; if flat, value = mean, slope = 0. Derivation: elevation of a given cell - (mean elevation of all cells within 300 m) | 1, 2, 3 |
| EX3MIN | Minimum topographic exposure within a 300-meter radius | 1, 2, 3 |
| EX3MAX | Maximum topographic exposure within a 300-meter radius | 1, 2, 3 |
| Lex3std | Standard deviation of topographic exposure within a 300-meter radius (ln-transformed) | 1, 2, 3 |
| EX15MEAN | Mean topographic exposure within a 150-meter radius | 1, 2, 3 |
| EX15MIN | Minimum topographic exposure within a 150-meter radius | 1, 2, 3 |
| EX15MAX | Maximum topographic exposure within a 150-meter radius | 1, 2, 3 |
| Lex15std | Standard deviation of topographic exposure within a 150-meter radius (ln-transformed) | 1, 2, 3 |
| H20MEAN | Mean distance in meters from the centroid of the canyon segment to running or standing water; minimum = 0, maximum set at 500. Derivation: distance from the centroid to the nearest water on the DLG hydrology coverage | 2, 3, 6 |
| H20MIN | Minimum distance to running or standing water | 2, 3, 6 |
| H20MAX | Maximum distance to running or standing water | 2, 3, 6 |

ᵃ In the digital elevation model (DEM), cells are 30 horizontal meters on each side. All insolation and sunlight values were derived with the SolarFlux AML script.

ᵇ Data and software needs: 1 = DEM, 2 = ArcView, 3 = Spatial Analyst (a plug-in extension for ArcView), 4 = ArcInfo, 5 = SolarFlux Arc Macro Language (AML) script (developed and distributed by the GIS and Environmental Modeling Laboratory at the University of Kansas, http://www.gemlab.ukans.edu/gemlab/software.htm), 6 = digital line graph (DLG) hydrology coverage.
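
Several of the derivations listed in Table 45.1 are simple enough to express directly. The Python sketch below is an illustration only, not the ArcView or ArcInfo workflow used in the case study. It assumes aspect and slope are given in degrees, treats "cell size" as the planimetric area of a cell, and assumes the exposure neighborhood includes every cell whose center lies within the stated radius (including the focal cell). The function names are hypothetical.

```python
import math

def eastness(aspect_deg):
    """100 * sine(aspect): -100 is west-facing, 100 is east-facing."""
    return 100.0 * math.sin(math.radians(aspect_deg))

def northness(aspect_deg):
    """100 * cosine(aspect): -100 is south-facing, 100 is north-facing."""
    return 100.0 * math.cos(math.radians(aspect_deg))

def gradient_deg(max_elev, min_elev, length):
    """Gradient in degrees; the factor 57.3 in the table approximates 180/pi."""
    return math.degrees(math.atan((max_elev - min_elev) / length))

def surface_area(cell_area, slope_deg):
    """Sloped surface area of a cell: planimetric area / cosine(slope)."""
    return cell_area / math.cos(math.radians(slope_deg))

def exposure(elev, row, col, radius_m, cell_m=30.0):
    """Elevation of a cell minus the mean elevation of cells within radius_m."""
    reach = int(radius_m // cell_m)
    values = []
    for i in range(max(0, row - reach), min(len(elev), row + reach + 1)):
        for j in range(max(0, col - reach), min(len(elev[0]), col + reach + 1)):
            if math.hypot(i - row, j - col) * cell_m <= radius_m:
                values.append(elev[i][j])
    return elev[row][col] - sum(values) / len(values)

# Toy 5 x 5 elevation grid (meters) with a single high cell in the center.
grid = [[1000.0] * 5 for _ in range(5)]
grid[2][2] = 1100.0

print(round(eastness(90), 1), round(northness(180), 1))
print(round(gradient_deg(1100, 1000, 100), 1))
print(round(surface_area(900.0, 60.0), 1))
print(round(exposure(grid, 2, 2, 60.0), 2), round(exposure(grid, 2, 1, 60.0), 2))
```

Per-segment variables such as the mean, minimum, maximum, and standard deviation would then be summaries of these cell values over all cells in the buffered segment, with a natural-log transform applied where the variable name begins with "L".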

Our multiple regression model performed fairly 
well at classifying locations used to build the model 
with respect to species richness. For example, the 
model correctly classified 85 percent of the thirty- 
three canyon segments with at least twenty-seven but- 
terfly species (i.e., more than 50 percent of the maxi- 
mum number of species recorded from a canyon 
segment), and 50 percent of the ten canyon segments 
with forty or more species (i.e., more than 75 percent 
of the maximum number of species recorded). The 
model's error rate was relatively low. It overestimated 
species richness in seven (20 percent) of the thirty-five 
canyon segments with twenty-six or fewer species; in
no instance did the model erroneously predict that a
canyon segment was occupied by forty or more
species.


Species Richness: Independent Tests 


We conducted two independent assessments of the ac- 
curacy of our predictive model. First, we examined its 
ability to predict species richness of additional loca- 
tions within the same geographic planning unit (i.e., 
other canyon segments in the Toiyabe Range). The 
model was highly statistically significant (F 33 = 37.5, 
P ≪ 0.001) and explained 53 percent of the variance in
species richness. It correctly classified fourteen (93 
percent) of the fifteen segments with twenty-seven or 
more species and misclassified four locations with 
fewer than twenty-seven species (21 percent). Al- 
though only one of five locations with forty or more 
species was classified correctly, the misidentification 
rate was a mere 3 percent. 

Managers in our study region would like to predict 
species richness patterns in other planning units (mountain
ranges) that are remote and poorly inventoried but
that are nonetheless expected to support multiple uses
and objectives. Therefore, we conducted a second as- 
sessment of the accuracy of our model using data from 
forty-three canyon segments in the neighboring 
Toquima Range, a 1,750-square-kilometer mountain 
range that lies roughly 10 kilometers to the east of the 
Toiyabe Range. Methods for butterfly inventories and 
segment delineation were identical to those used in the 
Toiyabe Range. For each canyon segment, we used 
GIS to derive the model's predictive variables. We then 
predicted species richness of butterflies using the 
Toiyabe Range model and compared those values to 
our observations. 

The relationship between predicted and observed 
species richness was not statistically significant (F4 45 = 
0.38, P = 0.54). Nor could the model be used for qualitative
predictions, in other words, to identify which
locations have relatively many or relatively few species
of butterflies (Spearman rank-correlation: ρ = 0.05, P
= 0.73). We can draw at least some ecological inferences
from these results. For example, the Toquima
Range is smaller and, on average, lower in elevation 
and more arid than the Toiyabe Range. Thus, it spans 
a different range of values with respect to several pre- 
dictive variables. It probably would be more appropri- 
ate to apply the Toiyabe-based model to mountain 
ranges with relatively similar topography and mois- 
ture gradients. The results of our independent tests 
emphasize the importance of conducting validation as- 
sessments before employing a species richness model 
in management planning for locations that were not 
used to build the model. 
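
The qualitative check described above asks only whether predicted and observed richness rank the segments in a similar order. A minimal sketch of a Spearman rank correlation follows, written in plain Python so that it is self-contained; the richness values are hypothetical and are not the Toquima Range data, and the function names are illustrative.

```python
def ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical predicted and observed species richness for six segments.
predicted = [31, 25, 40, 22, 35, 28]
observed = [18, 30, 21, 27, 33, 15]
rho = spearman_rho(predicted, observed)
print(f"rho = {rho:.2f}")
```

A value of ρ near zero, as reported for the Toquima Range, means the model cannot even order locations from species-poor to species-rich, regardless of how far the predicted counts are from the observed ones.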


Occurrence of the 
Apache Silverspot Butterfly 


Many species that inhabit the Great Basin face sub- 
stantial threats to their viability. Managers of public 
lands are eager to develop conservation plans for these 
taxa that are compatible with multiple-use mandates 
and will preclude the need to invoke the federal En- 
dangered Species Act. To facilitate these goals, it 
would be expedient to predict whether (1) species of 
concern occupy areas that have not yet been inventoried,
(2) locations that are not currently occupied nevertheless
appear to be suitable habitat, and (3) management
actions could render a location suitable for
the target species. 

In the second step of our case study, we used logis- 
tic regression to determine whether environmental 
variables explained significant variance in the occur- 
rence pattern of Speyeria nokomis apacheana (S. 
nokomis), the Apache silverspot butterfly. Breeding 
populations of this rare butterfly are confined to seeps, 
springs, and riparian areas in the western and central 
Great Basin. We chose to model the occurrence of this 
taxon because not only the butterfly itself but also its 
habitat is the focus of conservation attention. In xeric 
ecoregions, riparian areas both provide water and 
have plant communities with relatively diverse compo- 
sition and structure. Therefore, they tend to receive 
disproportionately heavy use from numerous faunal 
groups, including humans (Kauffman and Krueger 
1984; USGAO 1988; Armour et al. 1991; Thomas 
1991; Dawson 1992; Chaney et al. 1993). Protection 
of riparian-obligate plants and animals is frequently a 
high priority of resource agencies in the Great Basin. 

Step two included all of the 102 locations and inde- 
pendent variables described in step one and incorpo- 
rated one additional variable, distance to the nearest 
canyon segment occupied by S. nokomis. The logistic 
regression model had high statistical significance (deviance
goodness-of-fit χ² = 22.09, df = 97, P = 1.000)
and identified five variables that were significantly associated
with the presence of the species. The variable
that was most strongly correlated with occurrence of
S. nokomis, distance to nearest occupied location (improvement
χ² = 34.74, df = 1, P < 0.0001), suggested
that the spatial distribution of the butterfly may affect
its viability. This result is consistent with our determination
that S. nokomis is distributed as a metapopulation
in the Toiyabe Range (see “Persistence of the
Apache Silverspot Butterfly”). Probability of occurrence,
not surprisingly, also was greater in locations
that are relatively close to water (χ² = 23.38, df = 1, P
≪ 0.0001), and in locations that have high vernal equinox
insolation (χ² = 11.57, df = 1, P < 0.001), face
east (χ² = 4.95, df = 1, P < 0.05), and are relatively flat
(χ² = 3.43, df = 1, P < 0.10).
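
The significance thresholds attached to each improvement chi-square statistic can be checked directly, because for one degree of freedom the upper-tail probability reduces to a complementary error function. The sketch below is an illustrative check, not the authors' software; the function name and the short variable labels are assumptions introduced here.

```python
import math


def chi2_pvalue_df1(statistic):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom.

    For df = 1, P(X >= x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


# Improvement chi-square statistics reported in the text (each with df = 1).
reported = {
    "distance to nearest occupied location": 34.74,
    "distance to water": 23.38,
    "vernal equinox insolation": 11.57,
    "east-facing aspect": 4.95,
    "flatness": 3.43,
}
for name, statistic in reported.items():
    print(f"{name}: P = {chi2_pvalue_df1(statistic):.2g}")
```

Under this calculation the flatness term falls between 0.05 and 0.10, which matches the weaker threshold reported for it.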

Managers obviously cannot affect the insolation or 
aspect of a given location, but our results indicate that 
they can take several steps to help protect the butterfly.
For example, locations with isolated populations
might receive a lower priority for species conserva- 
tion, in other words, a higher priority for other land 
uses, than locations that are close to occupied areas. 
Another potential management action, particularly in 
areas with extremely wet soils, would be to switch the 
period of heaviest livestock use from early summer to 
late summer or early autumn. This would help prevent 
soil compaction, depression of the water table, and, 
ultimately, conversion of mesic meadows to dry mead- 
ows. If resources (and ecological circumstances) allow 
for restoration of degraded riparian areas, then our 
analyses also may give managers some guidance as to 
which locations, once rehabilitated, are most likely to 
support S. nokomis. 

Because we have searched virtually all accessible 
areas in the Toiyabe Range that might have suitable 
habitat for S. nokomis, and because historical records 
and recent searches indicate that the butterfly is not 
present in any nearby (within 100 kilometers) moun- 
tain ranges, we have not yet conducted an independ- 
ent test of our occurrence model. However, we expect 
to initiate surveys for the butterfly in one or more dis- 
tant Great Basin mountain ranges in upcoming field 
seasons. At that point, we plan to use the model in 
conjunction with GIS-based maps and computational 
tools to identify promising locations for field surveys. 
The outcome of those surveys will provide data with 
which to test the accuracy of the model. 


Persistence of the 
Apache Silverspot Butterfly 


Step two of our model demonstrated that strong cor- 
relations exist between easily measured environmental 
variables and the distribution of Speyeria nokomis 
apacheana. Step two thus yielded a means tentatively 
to classify locations as suitable or unsuitable for the 
species. As a third step, we developed spatially realis- 
tic models to assess probabilities of turnover (colo- 
nization or extinction) at locations that appear to be 
suitable for the taxon. We defined a patch as suitable 
if adult butterflies were recaptured there and/or their 
larval host plants and adult nectar sources were present.
Calculating turnover probabilities for individual
patches of habitat may help practitioners prioritize
which patches are most critical for viability. 

Field data collected between 1994 and 1999 indi- 
cate that in the Toiyabe Range, S. nokomis exists as a 
traditionally defined metapopulation (many small and 
ephemeral populations) (Hanski 1994a,b; Hanski et 
al. 1995; Harrison and Taylor 1997). Therefore, we 
included both local habitat features (e.g., area, percent 
cover of larval host plants) and larger-scale attributes 
(e.g., isolation) as predictive variables (Table 45.2). 
Values were obtained from recent field measurements 
or GIS. Surface area was the only variable common to 
both step two and step three. In step two, however, 
area was measured as the area of the canyon segment, 
which often included both suitable and unsuitable 
habitat. Patch area, included in step three, encom- 
passed only suitable habitat. 

Because we had data from multiple time steps (an- 
nual censuses), and because those data demonstrated 
that occupancy fluctuates in time, we used logistic re- 
gression to model extinction and colonization patterns 
pooled over four years. We found that, relative to 
patches that remained vacant, patches that were colo- 
nized had significantly (P < 0.05) greater host plant 
abundance. In addition, the bull thistle Cirsium vul- 
gare tended to be present in patches that were colo- 
nized. State transitions for unoccupied patches were 
predicted moderately well with the resulting colonization
logit (deviance goodness of fit: χ² = 19.35, df =
22, P = 0.624). Patches that went extinct were nearer
to other extinction patches, lacked the lavender thistle
Cirsium neomexicanum, and had a greater percent
cover of live vegetation and litter. The logistic regression
based upon the extinction logit predicted extinctions
quite well (deviance goodness of fit: χ² = 9.89, df
= 34, P = 1.00).

Because steps two and three included different environmental
variables, the significant predictors of extinction
and colonization in the logistic models (step
three) were distinct from the significant predictors of
occupancy in the logistic regression model (step two).
Step two essentially produced a quantitative definition 
of suitable habitat; step three provided additional in- 
formation about potential gradients in the quality of 
suitable habitat (Harrison and Taylor 1997; Thomas 
and Hanski 1997). Thus, step two might provide 
TABLE 45.2.

Environmental variables included in logistic regression models
of extinction and colonization probabilities for Speyeria
nokomis apacheana (case study, step three).

Variable    Description

Vegetation variablesᵃ

VINEct      Number of host plants (the violet Viola nephrophylla)
THISTct     Number of adult nectar sources (various species of
            thistles, see below)
VINEcov     Percent cover of host plants
THISTcov    Percent cover of thistles
TVC         Percent cover of live vegetation
LITTER      Percent cover of dead vegetation
BARE        Percentage of ground not covered by live vegetation
            or litter
VEGht       Mean height of live and standing dead vegetation
CANU        Presence of the thistle Carduus nutans
CIVU        Presence of the thistle Cirsium vulgare
CISC        Presence of the thistle Cirsium scariosum
CINE        Presence of the thistle Cirsium neomexicanum

Other habitat attributes

AREA        Area in square meters (ln-transformed)
Dloc        Distance in meters to the closest occupied patch for
            each year 1995-1998, taken from censuses in the
            preceding year
Dext        Distance in meters to the closest neighboring
            extinction site for each year 1995-1998, taken from
            same-year censuses
DIST        Human disturbance (including livestock grazing and
            recreational use) for each year 1994-1998,
            categorized as minimal (0), moderate (1), or heavy (2)
PERIM       Perimeter in meters
UTMX        Latitudinal coordinate
UTMY        Longitudinal coordinate

ᵃ Data collected from 1-square-meter quadrats spaced evenly
throughout each patch. In most patches, the number of quadrats
sampled was proportional to patch area. Abundance was recorded as
the midpoint of one of four abundance classes (1-3, 4-10, 11-30,
31-100; Fleishman et al. 1996). Percent cover was recorded as the
midpoint of one of twelve percent cover classes (0-0.99, 1-4.99,
5-14.99, 15-24.99, 25-34.99, 35-44.99, 45-54.99, 55-64.99,
65-74.99, 75-84.99, 85-94.99, 95-100). Absolute cover values
were recorded. In other words, 20 percent cover means 20 percent of
the quadrat, not 20 percent of the total vegetation cover in that
quadrat.


managers with a relatively coarse filter through which 
to evaluate whether certain areas could support a 
focal species, while step three could guide strategies 
for maintaining or improving habitat quality over 
time. 

To test the accuracy of the turnover models for S. 
nokomis, we will use the colonization and extinction 
logits to calculate probabilities of colonization and ex- 
tinction for each of the thirty-nine patches of suitable 
habitat. We will then compare those “predicted”
probabilities to turnover events that are observed in 
2000 and beyond and test whether significantly more 
predictions are accurate than would be expected by 
chance. We are optimistic about the outcome of these 
tests in light of the models’ ability to account for vari- 
ation in the data used for their construction. Some po- 
tential deviations between predicted and observed 
states may be explained by our recent finding that in 
the Toiyabe Range metapopulation of S. nokomis, dif- 
ferent aspects of habitat quality affect turnover pat- 
terns in different years. We have also discovered that 
factors associated with turnover in single years are not 
always reflected in multiple-year analyses. We suggest 
that there is likely a spectrum of relative variation in 
habitat quality along which metapopulations fall. 
Toiyabe Range S. nokomis appear to lie at the higher 
end of this gradient, opposite from several systems 
that have laid the groundwork for current metapopu- 
lation paradigms (Thomas and Hanski 1997). 
Nonetheless, our models may be quite useful for iden- 
tifying habitat attributes that may affect turnover in 
some years, and we believe that our overall modeling 
approach holds considerable promise for management 
of diverse “patchy” populations. 
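
The planned test, whether significantly more predictions are accurate than would be expected by chance, can be framed as a one-sided exact binomial test. The sketch below is one possible formulation under stated assumptions, not the authors' procedure: it assumes each patch-level prediction is scored as simply right or wrong, that a chance prediction is right with probability 0.5, and it uses a hypothetical tally of thirty correct predictions among the thirty-nine patches.

```python
from math import comb


def binomial_upper_tail(successes, trials, chance=0.5):
    """P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical outcome: 30 of 39 patch-level predictions turn out correct.
p_value = binomial_upper_tail(30, 39)
print(f"P = {p_value:.4f}")
```

If colonization or extinction is rare, a chance level of 0.5 would be too lenient; the `chance` argument would then need to reflect the accuracy of always predicting the more common outcome.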


Discussion 


Management at the regional level has multiple objec- 
tives, including maintenance of native species diversity, 
conservation of rare species, and tracking the effects of 
ecological changes on biological communities. No sin- 
gle tool or method is appropriate for all management 
challenges (Maurer, Chapter 9; Van Horne, Chapter 4). 
Instead, it is important to develop and validate a range 
of complementary modeling approaches that can be 
employed in various scenarios. From a technical standpoint,
it can be extremely difficult to develop a model
that successfully predicts species richness patterns in a 
geographic region other than that used to construct the 
model, yet the potential value of such a model is mani- 
fest. Validation is essential if one hopes to apply a sin- 
gle model across many planning units. 

Our modeling framework takes somewhat different 
approaches to predicting species richness and to mod- 
eling occupancy and, particularly, to modeling persist- 
ence probabilities of individual taxa. Our species rich- 
ness and occupancy models rely especially heavily 
upon variables that can be derived efficiently with 
GIS. The reasons for this are largely pragmatic. Al- 
though modeling species distributions as a function of 
location-specific variables related to the resource re- 
quirements of the focal assemblage frequently yields 
successful results (Hanski and Gilpin 1997), obtaining 
the necessary data can present substantial logistic ob- 
stacles. Furthermore, planning and decision-making 
for federal lands is often based in a central office—a 
“headquarters.” Personnel may have less-detailed
knowledge of local environmental conditions or 
species distributions than do staff in field offices, but 
they are more likely to have access to computerized 
mapping and planning tools, including digital and re- 
motely sensed sources of data. A central office also 
may be the best location for tracking and archiving 
data on landscape features such as percent cover of 
major vegetational communities and road density. 

Managers can overlay predictions of species rich- 
ness with predicted occurrences of species of interest 
and data on land-use patterns. The resulting maps 
may be utilized for gap analysis (e.g., Caicco et al. 
1995; Edwards et al. 1995), to prioritize field investi- 
gations, and to guide land-use planning (Cogan, 
Chapter 18). If road closures are politically con- 
tentious, for example, then it may be possible to mini- 
mize vehicle restrictions in locations with a baseline 
topography that is unlikely to harbor a relatively high 
number of native species. 

The multiple regression analysis in step one of our 
model obviously does not examine the requirements 
of individual species of management interest, nor does
it address the probabilities of persistence of those
species at certain locations. Environmental variables 
that influence species richness may have little effect on 
the distribution of a particular rare taxon (Cody 
1986; Thomas 1995; Fagan and Kareiva 1997; Freitag 
et al. 1997), and even locations with relatively low 
species diversity may serve as key supports for popula- 
tions in species-rich areas (Fleishman et al. 2000). In 
addition, management for individual species often 
must rely more heavily upon local and/or species- 
specific data than upon remotely sensed variables. 
Step two of our model begins to address the needs of 
target species using a readily available set of environ- 
mental variables. It highlights some ecological condi- 
tions that are associated with the presence of the 
species and by extension suggests management actions 
that may promote occupancy. Step three incorporates 
relatively detailed habitat and species data in order to 
generate more-refined predictions and associated man- 
agement recommendations. 

As we noted, some species are of interest to practi- 
tioners because of their potential to serve as surrogate 
measures of high species richness or ecosystem func- 
tion. Some species also may be useful for quantifying 
the biological effects of known ecological changes, 
particularly if those changes can be modified by man- 
agement. One practical obstacle to using so-called in- 
dicator or umbrella species is that their occupancy can 
fluctuate independently of variation in human- 
influenced environmental parameters. By identifying 
some of the factors that may affect occupancy and 
persistence of potential indicator species, steps two 
and three of our modeling framework can help test 
whether the species is likely to function as a scientifi- 
cally reliable surrogate. 

Land-use planners rely upon predictive models and 
other tools to inform management strategies in the ab- 
sence of detailed data. Models that can address a 
range of ecological phenomena, from species richness 
to occurrence patterns of rare or invasive taxa, may 
assist practitioners in delineating land uses, prioritiz- 
ing field research, and anticipating the outcome of var- 
ious management alternatives. 

Discontinuity in Stream-fish Distributions:
Implications for Assessing and
Predicting Species Occurrence


Paul L. Angermeier, Kirk L. Krueger, and C. Andrew Dolloff 


Overview of Species Occurrence 


Understanding the patterns and causes of species dis- 
tributions through time and space is a central goal of 
ecology. Development of models to predict species oc- 
currence (i.e., presence or absence) enhances ecologi- 
cal knowledge and conservation effectiveness. Predict- 
ing occurrence for unsampled times or places requires 
the ability to explain alternations between presence 
and absence along environmental continua within a 
species’ geographic range. Moreover, such explana- 
tions must distinguish between real discontinuity and 
the perceived discontinuity stemming from modeling 
or sampling errors. 

Based on a review of the literature and our experi- 
ences, we propose that discontinuity in species distri- 
butions is regulated by environmental suitability and 
colonist availability. Environmental suitability com- 
prises both abiotic and biotic factors. A broad array of 
physical and chemical features determines habitat 
(abiotic) suitability; greater suitability facilitates conti- 
nuity in distributions. In contrast, intense biotic inter- 
actions promote discontinuous distributions. Colonist 
availability comprises three factors: population size, 
fluctuation in population size, and dispersal ability. 
We define dispersal as movement beyond the home 
range, including migratory and nonmigratory movement.
Greater population size and dispersal ability facilitate
continuity in species distributions, whereas
wide fluctuations in population size promote discontinuity.
Explicitly incorporating these five factors into
explanations of species distributions should enhance 
the reliability of predictive models. 

The factors regulating discontinuity interact in 
complex ways. For example, habitat suitability may 
not predict occurrence if species-habitat associations 
are density-dependent. Both chance and biotic interac- 
tions can cause density-dependence, but their relative 
importance is rarely examined. Correlations between 
species abundance and occurrence are expected by 
chance, even with no habitat selection or biotic inter- 
actions (Wright 1991). At low population densities, 
the association between a preferred habitat configura- 
tion and species presence may be weak, even if prefer- 
ence is strong. Conversely, at high densities, preferred 
configurations may not accommodate all individuals, 
and a species might occupy a broad range of habitats, 
including less-preferred configurations. Similar distri- 
bution patterns commonly are predicted by models 
that assume animals make informed, optimal choices 
based on habitat suitability (e.g., Fretwell and Lucas 
1969). That is, intraspecific competition could expand 
the range of suitable patches, and habitat selectivity 
(as measured by presence/absence) would appear re- 
duced. However, for suitable habitat to be used, it 
must be accessible (i.e., within a species’ dispersal abil- 
ity). Thus, occurrence is also related to proximity to 
centers of abundance (Legendre 1993; Hanski 1994b), 
where proximity is defined in terms of dispersal
ability. 
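
The density-dependent pattern described above can be illustrated with a toy settlement rule in the spirit of Fretwell and Lucas (1969). This is a minimal sketch under simplifying assumptions, not a model from this chapter: individuals arrive one at a time, each chooses the patch with the highest current suitability, and suitability declines linearly with the number of residents. The suitability values and the crowding cost are hypothetical.

```python
def settle(n_individuals, base_suitability, crowding_cost):
    """Place individuals one at a time in the patch whose current
    suitability (base suitability minus crowding cost per resident)
    is highest; return the number of residents in each patch."""
    counts = [0] * len(base_suitability)
    for _ in range(n_individuals):
        current = [b - crowding_cost * c for b, c in zip(base_suitability, counts)]
        # Ties go to the first (more preferred) patch in the list.
        best = max(range(len(current)), key=lambda i: current[i])
        counts[best] += 1
    return counts


# Hypothetical patches ordered from most to least preferred.
patches = [10.0, 6.0, 3.0]
low_density = settle(3, patches, crowding_cost=1.0)
high_density = settle(20, patches, crowding_cost=1.0)
print(low_density, high_density)
```

At low density only the preferred patch is occupied; at high density every patch is occupied, so presence/absence data alone would suggest weaker habitat selectivity even though the preference itself is unchanged.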

Models to predict species occurrence are built from 
sample observations of presence versus absence. Be- 
cause observed absences are more difficult to interpret 
than observed presences, the interpretation of observed
absences is more likely to control model reliability.
False absences are caused by three types of
error: (1) failure to sample appropriate spatial or tem- 
poral strata, (2) inadequate sampling effort, and (3) 
ineffective or inappropriate survey technique. Herein, 
we focus on the first two error types because we be- 
lieve that they are most limiting to the understanding 
of discontinuity in species distributions. 

Although observations of occurrence are instanta- 
neous, the predictions of interest typically involve ex- 
trapolation to much longer time frames (e.g., years). 
To be predictive in “new” places and times, models 
must incorporate knowledge of spatial and temporal 
variation in occurrence, which is accomplished via as- 
sessment of occurrence over ranges of the values 
taken by regulating factors. Incorporation of larger 
ranges should produce more generally applicable 
models. Relations between occurrence and some reg- 
ulating factors (e.g., habitat suitability) are commonly 
incorporated into predictive models, whereas influ- 
ences of other factors (e.g., population fluctuation) on 
occurrence are rarely modeled. In part, this disparity 
reflects the greater feasibility of measuring physico- 
chemical conditions relative to estimating population 
parameters and biotic processes. In this chapter, we 
summarize knowledge of spatial and temporal varia- 
tion in habitat suitability for stream fishes as might be 
used in building models to predict species occurrence. 
Where possible, we also discuss documented and hy- 
pothesized influences of other regulating factors on 
occurrence. 


Stream-fish Distributions 


Models to predict occurrence of stream fishes are in- 
teresting because stream features are highly variable 
through time and space and because modern sampling 
technology often allows confident assessment of occurrence.
Local assemblages of stream fishes are subsets
of regional species pools; local species have been
“filtered” by a hierarchical array of limiting factors related
to dispersal ability, physiological tolerance, life
history, and biotic interactions (Angermeier and Winston
1998). Spatiotemporal variation in the filtering
process generates discontinuous distributions over a 
broad range of spatial scales (extent and grain). Confi- 
dence in assessments of species absence accrues be- 
cause system boundaries (i.e., streambanks) are fre- 
quently obvious and because sampling effectiveness is 
well studied (Hankin and Reeves 1988; Bayley et al. 
1989; Angermeier et al. 1991; Dolloff et al. 1993; En- 
sign et al. 1995; Simonson and Lyons 1995; Thurow 
and Schill 1996). Many sampling protocols provide 
the accuracy and known variance needed to develop 
predictive models. 

The grain at which discontinuity in a species' distri- 
bution is relevant depends on the question being 
asked. Distribution within a pool might be especially 
relevant to individual survival, whereas distribution 
across a landscape might be especially relevant to 
species conservation. Detectability and pattern of occurrence
are functions of study grain (Wiens 1989a).
Moreover, the range of relevant scales is species- 
specific, bounded by the smallest scale at which 
patches are differentiated (species grain) and the 
largest scale of heterogeneity perceived (species extent; 
Kotliar and Wiens 1990). Wide-ranging, mobile 
species (great dispersal ability) tend to perceive mo- 
saics at a given scale as more homogeneous than do 
more sedentary species (Kotliar and Wiens 1990). 
Multiscale analyses generally offer the most powerful 
approach for understanding stream-fish occurrence 
patterns (Lohr and Fausch 1997; Watson and Hillman 
1997; Wiley et al. 1997; Angermeier and Winston 
1998; Dunham and Rieman 1999; Torgersen et al. 
1999). 

Frissell et al. (1986) provide a hierarchical frame- 
work for stream habitat; levels increase in grain from 
microhabitat to stream system and reflect basic geo- 
morphic processes. We adopted their two largest-grain 
levels, which we call reach and watershed. A reach is a 
continuous array of pool-riffle configurations (10¹ to
10? meters) and is bounded by major geomorphic
(e.g., waterfall) or hydrologic (e.g., confluence with
another reach) discontinuities. A watershed is a network
of reaches and intervening land (10⁶ to 10⁸
square meters) with a single outflowing reach of third-
or fourth-order (Strahler 1957). We also consider a
larger-grain level: landscape, which comprises a cluster
of watersheds (10⁷ to 10? square meters) with a
single outflow. Landscapes are delineated by bound- 
aries of major river drainages or physiographic 
provinces; they are relatively homogeneous with re- 
spect to phylogenetic history (Hocutt and Wiley 1986; 
Mayden 1992) and habitat availability. Geographic 
ranges of many fish species coincide with drainage 
and/or physiographic boundaries. We believe that the 
three study grains considered in this chapter represent 
the scales most relevant to the conservation of stream 
fishes. 

We address three major objectives. First, we review 
major spatiotemporal strata related to discontinuity in 
stream-fish distributions at three study grains (reach, 
watershed, and landscape). Samples from key strata 
(the next-smaller grain-level) are critical to assessing 
whether a species is present or absent in a study unit 
at the grain of interest. Second, because patterns of occurrence
among sampled units provide a basis for predicting
occurrence in unsampled places or times, we review,
for each grain, how models are used to predict
stream-fish occurrence. Third, we develop a concep- 
tual model, including several hypotheses, for under- 
standing the relative importance, over a range of 
scales, of the five factors regulating discontinuity in 
species distributions. 


Patterns of Occurrence in a Reach 


Reliable assessments of species occurrence must be 
based on samples from appropriate spatial and tempo- 
ral strata. Stratified sampling of habitat increases the 
probability of detecting species relative to the proba- 
bility of detection with random sampling (McArdle 
1990). The most commonly studied strata in reaches 
are riffles and pools (Table 46.1), which are spatially 
distinctive configurations of depth, velocity, substrate 
type, and cover availability. Most fish species and life 
stages regularly use only a subset of the available con- 
figurations. Occurrence may also be influenced by 
habitat-unit size or proximity to dispersers. Biotic interactions
(e.g., competition, predation) commonly induce
shifts in habitat use. For example, benthic species
may exclude each other from riffles (Baltz et al. 1982;
Taylor 1996) or pool-dwelling piscivores may restrict 
their prey to riffles or pool margins (Power et al. 
1985; Schlosser 1987). 

Although they are less studied than spatial strata,
several important temporal strata occur in stream reaches
(Table 46.1). Environmental suitability of a reach may
fluctuate seasonally or vary sporadically with major 
disturbances (e.g., flood, drought). Annual variation 
in flow regime and longer-term variation associated 
with succession can strongly influence stream-fish re- 
productive success and occurrence. In addition, some 
species (e.g., anadromous fishes) may use a reach only 
seasonally for a particular life-history function (e.g., 
spawning), whereas others may be present only during 
periods of high abundance. Multiyear studies indicate 
that breadth of habitat use by some fishes is correlated 
with population density, presumably because of in- 
traspecific competition. 

The relationship between abundance and distribu- 
tion strongly influences the sampling effort needed to 
reliably assess species occurrence. The probability of 
detecting a species is generally correlated with population
size (Nichols et al. 1998a). Increasing the sampling
extent decreases the likelihood of a false absence,
especially for uncommon species. Because many
fish species are distributed sporadically along stream 
reaches, estimates of species richness are low and im- 
precise when sampling effort is small (Lyons 1992; 
Angermeier and Smogor 1995). The effort needed to 
detect most species varies among streams and is prob- 
ably related to species-specific habitat selectivities, 
population densities, and habitat complexity. Al- 
though largely undocumented, analogous relations be- 
tween number of reaches or watersheds sampled and 
species richness estimates for watersheds and land- 
scapes, respectively, also probably exist. For accurate 
assessments of occurrence, we generally expect study 
units with many sporadically occurring species to re- 
quire more sampling effort than do units with fewer 
such species. 
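
The link between sampling effort and false absences can be made concrete with a simple calculation. The sketch below assumes, as a simplification not made in the text, that each sample detects a species that is present independently and with the same probability; the function names, the per-sample probabilities, and the 0.95 target are illustrative.

```python
import math


def detection_probability(p_per_sample, n_samples):
    """Probability of detecting a present species at least once
    in n independent samples."""
    return 1.0 - (1.0 - p_per_sample) ** n_samples


def samples_needed(p_per_sample, target=0.95):
    """Smallest number of independent samples that gives at least the
    target probability of detection."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_per_sample))


# Hypothetical per-sample detection probabilities for a common
# and an uncommon species.
for p in (0.5, 0.1):
    n = samples_needed(p)
    print(p, n, round(detection_probability(p, n), 3))
```

Because the required effort rises steeply as per-sample detectability falls, study units with many sporadically occurring species would need far more samples before an observed absence can be trusted.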


Predicting Occurrence in a Reach 


Predicting species occurrence depends on estimates of 
environmental suitability and colonist availability, as 
TABLE 46.1.

Important strata associated with stream-fish occurrence at three study grains. Fish distribution is also influenced by a broad array of
water chemistry variables, such as pH, temperature, dissolved oxygen, and toxins, but these are not summarized here. Selected
references are provided for most strata.

Grain: Reach

Spatial strata
* Riffles, pools, and subtypes (Hawkins et al. 1993)
* Near to versus far from dispersers (Schlosser 1995, 1998)
* Large versus small habitat unit (Taylor 1997)

Temporal strata
* Seasons (Matthews and Hill 1979)
* Before versus after disturbance (Sedell et al. 1990; Schlosser 1995)
* Before versus after atypical flow regime (Freeman et al. 1988; Strange et al. 1993)
* Successional stages (Snodgrass and Meffe 1998; Schlosser and Kallemeyn 2000)
* Life-history stages
* High versus low population size (Fraser and Sise 1980; Moyle and Vondracek 1985)

Grain: Watershed

Spatial strata
* Large versus small reach (Burton and Odum 1945; Sheldon 1968)
* High versus low position (Gorman 1986; Osborne and Wiley 1992)
* Flow regime types (Poff and Allan 1995)

Temporal strata
* Seasons (anadromous species)
* Before versus after catastrophic disturbance (Reeves et al. 1995)
* Life-history stages
* High versus low population size

Grain: Landscape

Spatial strata
* Large versus small watershed (Rieman et al. 1997; Dunham and Rieman 1999)
* High- versus low-elevation watershed (Thurow et al. 1997; Dunham and Rieman 1999)
* Flow regime types (Thurow et al. 1997)
* Many versus few human modifications (Thurow et al. 1997; Dunham and Rieman 1999)
* Connected versus isolated watershed (Dunham et al. 1997; Dunham and Rieman 1999)

Temporal strata
* Seasons (long-distance anadromous species)
* Before versus after anthropogenic transformation (Moyle and Williams 1990; Angermeier 1995)
* Before versus after climate change (Matthews and Zimmerman 1990; Poff et al. 2001)


well as knowledge of the temporal variation in these 
estimates. Environmental suitability of a stream reach 
can be described by variables related to physiological 
tolerance (e.g., pH, temperature), habitat preference 
(e.g., depth, patch configuration), biotic interactions 
(e.g., abundance of predators or competitors), or any 
combination of these. Colonist availability in a reach 
is a function of past and current population abun- 
dance at the watershed scale and of accessibility to 
dispersers. 

Although abundance of species or life stages is often
estimated (Baltz 1990), we know of no predictive models
that incorporate most of the factors regulating
discontinuity in the distribution of fishes. The models
most commonly used to infer fish occurrence in reaches
are those developed via the instream flow incremental
methodology (IFIM), a complex protocol that uses
models of habitat suitability and hydraulics to predict
how habitat quality changes with variation in discharge
(Stalnaker et al. 1995). Users of the IFIM often
assume that fish abundance is proportional to habitat
availability, thereby implicitly assuming that other
factors do not influence abundance (or occurrence).
Sampling protocols such as the basinwide visual estimation
technique (BVET; Hankin and Reeves 1988; Dolloff et
al. 1993; Fig. 46.1) can generate empirical estimates of
fish abundance with known variances for reaches in a
watershed. This two-stage stratified-random protocol
has mostly been used to estimate salmonid abundance
but is also effective in estimating abundance of other
stream fishes (e.g., Leftwich et al. 1997). The BVET
focuses on spatial strata, but analogous protocols could
be developed to sample temporal strata.

Figure 46.1. Number of fish counted by a diver in
systematically selected pools and riffles in three reaches
of White Oak Run, Shenandoah National Park. Counts were
obtained using the basinwide visual estimation technique.
Reaches are bounded by the confluence with Luck Hollow and
a waterfall. Inset shows the density of fish over the
entire survey distance. Densities for riffles or pools or
reaches could also be estimated.
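
The habitat-availability logic that underlies IFIM applications can be sketched as a weighted-usable-area style index: each cell of a reach contributes its area multiplied by a composite suitability score, and the index is recomputed as hydraulic conditions change with discharge. The suitability curves, cell dimensions, and function names below are hypothetical and chosen only for illustration; they are not taken from Stalnaker et al. (1995) or from any specific application.

```python
from bisect import bisect_right

# Hypothetical suitability curves (0 = unsuitable, 1 = optimal).
# Each curve is (x values, suitability values); these are illustrative only.
DEPTH_CURVE = ([0.0, 0.2, 0.6, 1.5], [0.0, 1.0, 1.0, 0.2])     # depth in meters
VELOCITY_CURVE = ([0.0, 0.3, 0.8, 1.5], [0.2, 1.0, 0.5, 0.0])  # velocity in meters per second


def interpolate(x, curve):
    """Piecewise-linear interpolation on a curve, clamped at both ends."""
    xs, ys = curve
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def weighted_usable_area(cells):
    """Sum of cell area times composite suitability (depth x velocity).

    cells: iterable of (area_m2, depth_m, velocity_m_per_s) tuples.
    """
    return sum(
        area * interpolate(depth, DEPTH_CURVE) * interpolate(velocity, VELOCITY_CURVE)
        for area, depth, velocity in cells
    )


# Made-up hydraulic conditions for the same two cells at two discharges.
low_flow = [(10.0, 0.1, 0.0), (5.0, 0.4, 0.3)]
high_flow = [(10.0, 0.4, 0.3), (5.0, 1.5, 1.5)]
print(weighted_usable_area(low_flow), weighted_usable_area(high_flow))
```

An index of this kind describes habitat availability only. Treating it as a predictor of abundance or occurrence carries the implicit assumption noted above, namely that no other factor limits the population.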

Many studies have addressed whether species-habi- 
tat associations observed in reaches of one system can 
predict fish occurrence in reaches of another system. 
Broadly transferable models would obviate the need 
to develop many system-specific models. A few models 
of fish-habitat associations are broadly transferable 
(e.g., Belaud et al. 1989) but most are not, especially 
across major drainages (Layher et al. 1987; Leftwich 
et al. 1997) or regions (Bowlby and Roff 1986; Hu- 
bert and Rahel 1989). Similarly, fish-habitat associa- 
tions in a reach can vary among years (Angermeier 
1987; Bozek and Rahel 1992). To enhance model 
transferability, researchers will need to explicitly
incorporate factors that limit fish distribution and
abundance. Rather than relying on population density to
indicate habitat quality (Van Horne 1983), mechanis- 
tic links between habitat configurations and fitness 
measures (e.g., survival, reproduction) should be iden- 
tified (e.g., Dolloff 1987; Schlosser 1998). 


Patterns of Occurrence in a Watershed 


The most commonly studied strata in watersheds are 
related to stream size (Table 46.1), which integrates 
variation in a suite of physicochemical features, such 
as temperature and flow regimes, and depth and veloc- 
ity frequencies. Other spatial strata that influence 
stream fish distribution include position in the water- 
shed and flow regime type (Table 46.1). Biotic interac- 
tions that exclude species from reaches are scarcely 
documented, but complementary distributions of 
some species-pairs are attributable to competition 
(Winston 1995; Taylor 1996; Clark and Rose 1997) 
or predation (Fraser et al. 1995). 

Environmental suitability of a watershed may fluc- 
tuate seasonally or vary sporadically with catastrophic 
disturbances (Table 46.1). Watershed-wide extinctions 
induced by disturbances have long recurrence intervals 
(e.g., centuries) but may reverse themselves if suitable 
habitat remains or develops and if connections to dis- 
perser sources exist (Reeves et al. 1995). In addition, 
some species (e.g., anadromous fishes) may use a wa- 
tershed only seasonally for a particular life-history 
function (e.g., spawning). Density-dependence in the 
occurrence of fishes among watersheds is undocu- 
mented, but relations between occurrence and water- 
shed size and isolation (see Occurrence in a Land- 
scape) suggest influences of population stability and 
dispersal ability, respectively (i.e., colonist availabil- 
ity). Density-dependent occurrence may be especially 
likely for species that have declined dramatically be- 
cause of human impacts. Long-term dynamics of oc- 
currence among watersheds might be expected for 
species with metapopulation structure, but empirical 
evidence of such structure is scarce (Schlosser and 
Angermeier 1995; but see Rieman and Dunham 
2000). In any case, the ecological significance of a
species’ observed absence from a watershed (in a landscape
where the species is present) is equivocal because
long-term dynamics may obscure the watershed’s suitability.


Predicting Occurrence in a Watershed 


Predictions of species occurrence in watersheds are 
rare. Environmental suitability of a watershed might 
be described by the same variables used in reach-level 
models, but variables would be “scaled up” to repre- 
sent cumulative suitability of component reaches by 
using frequencies of various reach-types in the water- 
shed. In addition, suitability may be affected by key 
ecosystem properties such as flow, nutrient, sediment, 
and thermal regimes. The importance of certain reach- 
type combinations or juxtapositions to stream-fish oc- 
currence is largely unexplored (but see Labbe and 
Fausch 2000) despite recognition that many species 
require multiple habitat configurations to complete 
their life cycles (Schlosser and Angermeier 1995). In 
part, this information gap reflects the common belief 
that most stream fish move little (less than 500 meters) 
during their lifetimes (Gerking 1959; but see Gowan 
et al. 1994). However, recent findings of extensive 
movement suggest that some fish select and use habi- 
tat at large spatial scales (Fausch and Young 1995; 
Schlosser 1995; Gowan and Fausch 1996b). A few 
models based on watershed features that reflect both 
environmental suitability and colonist availability pre- 
dict occurrence of salmonids in Pacific Northwest wa- 
tersheds (Rieman et al. 1997; Thurow et al. 1997; 
Dunham et al., Chapter 26). We anticipate that analo- 
gous models will be useful for other taxa and regions. 
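
One simple reading of "scaling up" is a frequency-weighted mean of reach-type suitabilities. The sketch below takes that reading as an assumption; the reach-type labels, suitability scores, and function name are hypothetical, and the approach deliberately ignores the juxtaposition of reach types, which the text identifies as largely unexplored.

```python
from collections import Counter


def watershed_suitability(reach_types, suitability_by_type):
    """Frequency-weighted mean suitability over the reaches of a watershed.

    reach_types: list with one reach-type label per reach.
    suitability_by_type: mapping from reach-type label to a score in [0, 1].
    """
    if not reach_types:
        raise ValueError("at least one reach is required")
    counts = Counter(reach_types)
    total = sum(counts.values())
    return sum(suitability_by_type[t] * n / total for t, n in counts.items())


# Hypothetical watershed with four reaches of three types.
reaches = ["pool-dominated", "pool-dominated", "riffle-dominated", "channelized"]
scores = {"pool-dominated": 0.8, "riffle-dominated": 0.4, "channelized": 0.0}
print(watershed_suitability(reaches, scores))
```

Weighting by reach length rather than by count would be an equally defensible choice; the appropriate weights depend on how reaches are delineated.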

Some predictions of fish occurrence in watersheds 
are based on iterative assessments of reaches rather 
than on watershed descriptors (i.e., models with 
reach-level grain are extended to watersheds). How- 
ever, the influence of watershed-level land use is in- 
creasingly recognized in characterizations of reach- 
level suitability (Richards et al. 1996; Roth et al. 
1996). In reach-grain models (e.g., IFIM applications), 
predicted presence in any reach means predicted pres- 
ence in the watershed. The reliability of such predic- 
tions depends on the transferability of fish-habitat as- 
sociations from the reaches where a model was 
developed to the reaches of the watershed being assessed.
Thus, this approach has the same transferability constraints
as discussed for reach-level predictions.
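
The rule that predicted presence in any reach implies predicted presence in the watershed can be expressed probabilistically. The sketch below assumes that reach-level predictions are available as probabilities and that reaches are independent; both are illustrative assumptions, and the function name is hypothetical.

```python
from math import prod


def prob_present_in_watershed(reach_probs):
    """Probability of presence in at least one reach, assuming independent reaches."""
    probs = list(reach_probs)
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie between 0 and 1")
    # Presence in the watershed is the complement of absence from every reach.
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical reach-level predictions for one species in one watershed.
print(prob_present_in_watershed([0.10, 0.25, 0.05]))
```

Because neighboring reaches tend to share conditions and colonists, independence is unlikely to hold; positive dependence among reaches would tend to make the true watershed-level probability lower than this calculation suggests.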

Predicting occurrence in an unsampled watershed 
should be based on models developed from samples of 
similar watersheds in the landscape. Estimating pre- 
diction accuracy requires a probabilistic sampling pro- 
tocol (e.g., stratified-random). Presumably, a protocol 
analogous to the BVET could be developed for sam- 
pling watersheds in a landscape. However, because 
rare fishes are often distributed capriciously across 
landscapes, highly accurate predictions of their occurrence
in watersheds are unlikely (Boone and Krohn 1999).


Occurrence in a Landscape 


Spatial and temporal strata associated with fish occur- 
rence in landscapes are scarcely studied. Occurrence 
of some species is related to watershed size, elevation, 
or flow regime and may reflect occurrence in nearby 
watersheds (Table 46.1). Occurrence is also affected 
by human modifications such as dams and roads. We 
know of no species-pairs with complementary distri- 
butions among watersheds attributable to biotic inter- 
actions. Some long-distance migrants may use a land- 
scape only seasonally, but the most obvious temporal 
strata are related to human impacts on environmental 
suitability (Table 46.1). Specific mechanisms are 
poorly understood, but anthropogenic changes in 
flow, nutrient, sediment, and thermal regimes across 
landscapes are primarily responsible for the pervasive 
endangerment and extinction of North American 
stream fishes. 

Analyses of fish occurrence in landscapes are de- 
scriptive rather than predictive. Explanations for dis- 
continuous distributions of fishes among landscapes 
are actively debated by zoogeographers and systema- 
tists (Hocutt and Wiley 1986; Mayden 1992), but we 
know of no models that predict fish occurrence in un- 
sampled landscapes. However, some authors have 
tested predictions of occurrence in landscapes based 
on reach-grain models (Bowlby and Roff 1986; Layher
et al. 1987; Hubert and Rahel 1989; Leftwich et al.
1997). We speculate that certain combinations or
juxtapositions of watersheds may regulate metapopulation
persistence in some landscapes.


Synthesis and Hypotheses


Stream-fish distributions can be discontinuous at any 
spatial scale, but the influence of each regulating fac- 
tor on species absence probably varies greatly with 
study grain (Fig. 46.2). We expect a factor's influence 
on absence to be proportional to the range of varia- 
tion at a given grain, and we hypothesize that these in- 
fluences exhibit general patterns from microhabitat to 
regional scales. Habitat suitability, the focus of most 
occurrence models, strongly influences species absence 
at all scales (Fig. 46.2A). For stream fishes, habitat 
suitability typically is based on structural variables 
such as water depth and velocity, substrate and cover 
type, and patch size and juxtaposition as well as on a 
suite of water chemistry variables such as tempera- 
ture, pH, and dissolved oxygen concentration. In con- 
trast, biotic interactions such as competition, preda- 
tion, and disease are much more likely to cause species 
absence at small scales than at large scales (Fig. 
46.2B).

Population size and variability have opposing ef- 
fects on species discontinuity, but we expect the influ- 
ence of both to decrease with increasing study grain 
(Fig. 46.2C). Population size at a given grain will gen- 
erally be correlated with the number of spatial strata 
(next-smaller grain) occupied. However, as grain in- 
creases, optimal patch choice becomes increasingly in- 
feasible because of the escalating cost/risk of gathering 
relevant information (Lima and Zollner 1996). Fur- 
thermore, in habitat units large enough to support 
multiple populations, asynchrony in population fluc- 
tuations could preclude unit-wide density-dependence 
in occurrence. At smaller grains, we expect patch 
choice driven by availability of resources (e.g., food, 
spawning sites) to be more density-dependent than 
that driven by physiological tolerance. Population in- 
stability promotes species absence via the recurrence 
of small population size; fluctuations at large study 
grains are likely to be less severe or frequent than fluc- 
tuations at small grains. We expect occurrence models 
for species with stable populations to be more trans- 
ferable than models for species with highly variable 
populations. 

Barriers to movement can cause species absence at
any study grain. Barrier permeability, which is a
function of behavior, size, shape, dispersal frequency
and distance, and other factors, is poorly understood
for most fishes. Because barriers at small study
grains tend to be more temporary than barriers at
large grains (e.g., Schlosser and Kallemeyn 2000), we
expect the influence of dispersal limitations to
increase with grain (Fig. 46.2D). However, the grain
at which this influence is maximal is smaller for poor
dispersers than for good dispersers. Effects of
artificial barriers (e.g., dams) on distributions of
migratory fishes are well documented, but effects are
likely underestimated for nonmigrants (but see Winston
et al. 1991), which need to track spatiotemporal
variation in resource availability and habitat
suitability to persist.

Figure 46.2. Hypothesized relations between study grain
(size of habitat units) and the influence of habitat
suitability (A), biotic interactions (B), population size
or variability (C), and dispersal limitations (D) on
species absence from a habitat unit. Relations are based
on the assumption that the species is present in the
next-larger study grain.


TABLE 46.2.

Values (and probabilities) for Kendall’s tau concordance between proportions of habitat units from which
fish species were absent at four study grains.

Study grains(a)    Channel units(b)    Reaches(c)        HPUs(d)          Drainages(e)

Beaver Creek (31 species)
  N                18                  22                9                6
  Channel units                        0.499 (0.0002)    0.027 (0.847)    0.122 (0.380)
  Reaches                                                0.431 (0.002)    0.238 (0.091)
  HPUs                                                                    0.387 (0.008)

Bernards Creek (21 species)
  N                [illegible]         15                9                6
  Channel units                        0.284 (0.098)     0.114 (0.510)    0.281 (0.122)
  Reaches                                                0.390 (0.026)    -0.170 (0.354)
  HPUs                                                                    -0.032 (0.863)

(a) The species lists compared came from samples of two streams in the James River drainage, Beaver Creek and
Bernards Creek.
(b) Data for channel units came from samples of contiguous riffles and pools (Angermeier and Smogor 1995).
(c) Data for reaches came from a large statewide database of fish collections. Only those reaches associated with the
same HPU and a fish collection were included in analyses for Beaver Creek and Bernards Creek.
(d) Hydrologic-physiographic units (HPUs) are the portions of U.S. Geological Survey 8-digit hydrologic units that intersect
a single physiographic province (see Angermeier and Winston 1998). Data for HPUs came from a large statewide
database of fish collections. Only those HPUs with twenty or more fish collections were considered.
(e) Data for drainages came from Jenkins and Burkhead (1994) and are limited to native species. Only those
Atlantic-slope drainages well represented in Virginia were considered.

Our conceptual framework builds on the notion 
(e.g., Watson and Hillman 1997; Dunham and Rie- 
man 1999) that the most useful models of species oc- 
currence are hierarchical, beginning with patterns oc- 
curring over large areas. Occurrence at a given study 
grain is uninteresting unless presence at the next- 
larger grain is known or expected. The hypothesized 
relations in Figure 46.2 suggest which regulating fac- 
tors may be most influential at a given grain. We ex- 
pect occurrence among landscapes to be regulated pri- 
marily by habitat suitability and dispersal barriers, but 
occurrence among reaches to be regulated by habitat 
suitability, biotic interactions, and population size and 
stability. 

Factors may interact at any grain, but we expect
probability of occurrence generally to be independent
across grains. That is, infrequent presence at one grain
does not imply infrequent presence at another grain.
As a preliminary test of this hypothesis, we assessed
concordance of fish occurrences across four study
grains in Virginia (Table 46.2). Grain size ranged from
channel unit (e.g., riffle, pool) to major river drainage.
Overall, there was some concordance between species’
tendency to be rare (or widespread) at one grain and
their tendency to be rare (or widespread) at another
grain (four of twelve Kendall's tau values > 0.35;
Table 46.2). However, all significant concordance
occurred between the most-similar-sized grains, a
pattern that may be an artifact of spatial autocorrelation
(Legendre 1993).
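
The concordance test described above can be sketched as follows. For each species, the proportion of habitat units from which it was absent is computed at two grains, and Kendall's tau measures whether species that are rare at one grain also tend to be rare at the other. The sketch assumes the tau-b variant, which adjusts for ties; the text does not state which variant was used, and the species proportions below are invented for illustration. Probabilities such as those reported in Table 46.2 are not computed here.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from every term
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denom == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denom


# Hypothetical proportions of units from which five species were absent.
absent_channel_units = [0.1, 0.3, 0.5, 0.7, 0.9]
absent_reaches = [0.2, 0.1, 0.6, 0.8, 0.7]
print(kendall_tau_b(absent_channel_units, absent_reaches))
```

A value near 1 would indicate that rarity is consistent across the two grains, whereas a value near 0 would be consistent with the hypothesis that probability of occurrence is independent across grains.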

Predictive models of species occurrence are impor- 
tant conservation tools. They help managers to (1) iden- 
tify likely occupied areas for poorly surveyed species, 
(2) assess status of species relative to historical distribu- 
tion, (3) identify the best areas for re-introducing 
extirpated species, (4) justify protecting areas that are 
suitable for, but currently unoccupied by, valued 
species, and (5) predict the extent of invasion by
colonizing species. In addition, hierarchical models of
occurrence help managers focus on the most relevant
scales and regulating factors, thereby enhancing
cost-effectiveness. For most species, occurrence in reaches
and watersheds should be of greatest interest to man- 
agers because the time frames typically associated 
with population dynamics at those scales (years to 
decades) are the most tractable. However, highly mi- 
gratory species (e.g., diadromous fishes) do choose 
among landscapes; understanding occurrence of such 
species among landscapes will also facilitate effective 
management. Moreover, if metapopulation dynamics 
are important to the persistence of most stream fishes, 
understanding occurrence in landscapes will be critical 
to long-term conservation. In this context, predictive 
models can shed light on the conservation significance 
of particular presences and absences. An area that is 
primarily a population sink (i.e., species is present but 
not self-sustaining) is less conservation-worthy than a 
population-source area (Schlosser and Angermeier 
1995), and perhaps less conservation-worthy than 
some unoccupied, but environmentally suitable, areas 
(Rieman and Dunham 2000). Similarly, because 
stream-fish populations typically transcend individual 
reaches, occurrence at reach grains, where most mod- 
els focus, has less conservation significance than oc- 
currence at larger study grains. 


Research Needs 


The reliability of assessing or predicting stream-fish 
occurrence could be enhanced substantially by addi- 
tional research in three interrelated areas. First, better 
understanding of temporal variation in species distri- 
bution and abundance is needed, including analyses of 
variation in habitat use in the context of population 
fluctuations. Such analyses would clarify the impor- 
tance of density-dependence but would likely require 
much longer-term data than are typically collected 
(e.g., Wiley et al. 1997). Failure to incorporate temporal
variation into fish-habitat models can cause them to be
overfitted because temporal variation erroneously
appears as spatial variation (Wiley et al.
1997). Second, more insight into the spatial and tem- 
poral scales most relevant to stream fish population 
dynamics, including those of metapopulations, is 
needed to help identify key strata to include in occur- 
rence models. Especially lacking is knowledge of spa- 
tiotemporal dynamics of fish dispersal, including esti- 
mates of movement frequency and distance and 
barrier permeability. Finally, additional conceptions of 
hierarchical, multiscale models are needed to under- 
stand how the relative importance of various factors 
that regulate species occurrence changes with study 
grain and to understand whether occurrence patterns 
observed at one grain influence those, or can be ex- 
pected, at other grain sizes. 

Our ability to assess or predict occurrence has im- 
portant implications for how we prioritize areas for 
conservation and for our use of fish communities as en- 
vironmental monitors. Acquiring knowledge of occur- 
rence is a small but essential step toward achieving con- 
servation goals. Intense societal demands on ecosystems 
dictate that only the most irrefutable knowledge of oc- 
currence and persistence will be tolerated as justifica- 
tion for pursuing aggressive conservation of fish species 
or biotic integrity. In this context, the research areas 
noted above become crucial. Without basic understand- 
ing of the factors regulating distribution, scientifically 
sound recommendations on how to conserve fish popu- 
lations and communities are unlikely. As our under- 
standing of species discontinuity advances, so will the 
effectiveness of our conservation efforts.


A GIS-based Habitat Model for
Wood Thrush, Hylocichla mustelina, 
in Great Smoky Mountains National Park 


Susan A. Shriner, Theodore R. Simons, and George L. Farnsworth 


Management of wildlife populations requires knowledge
of the ecology and habitat preferences of the species
of interest. Typically, the majority
of habitat information available for Neotropical mi- 
gratory birds has been collected at a microhabitat 
scale. However, managers interested in identifying key 
areas for a given species rarely have the resources to 
collect detailed microhabitat data over broad regions. 
Therefore, identifying habitat variables that are both 
readily available over large areas and correlated to 
species occurrence is essential to effective management 
of large areas (Simberloff 1988). Topographic indexes 
based on readily available data (digital elevation mod- 
els) have been successfully applied to models devel- 
oped to predict plant occurrence (Wilds 1996; Wiser 
et al. 1998) but are not often used to predict animal 
occurrence. Nonetheless, these indexes may be useful 
predictors of animal occurrence over broad regions. In 
this chapter, we present a logistic regression model for 
wood thrush (Hylocichla mustelina) in Great Smoky 
Mountains National Park based on topographic and 
other readily available habitat variables. 

As the result of several studies showing possible 
population declines (Robbins et al. 1989b; Askins et 
al. 1990), Neotropical migratory songbirds have be- 
come major research species in conservation biology. 
In particular, the wood thrush has been the focus of 
many studies examining the impact of fragmentation
and land-use change on population abundance,
distribution, and productivity (Robinson 1988; Hoover
and Brittingham 1993; Robinson et al. 1993; Hoover 
et al. 1995; Robinson et al. 1995a). Most efforts to 
develop predictive models to assess the location of 
suitable habitat over large areas for Neotropical mi- 
gratory songbirds have given rise to models based on 
patch size and configuration and/or habitat fragmen- 
tation indexes. Although these types of models may 
be a useful first attempt at identifying potential habi- 
tat, they do not provide information about which 
habitat features within large forested landscapes are 
important predictors of Neotropical migratory song- 
bird occurrence. Therefore, these models may not be 
usefully applied to large areas of contiguous forests. 
On the other hand, models that do incorporate vari- 
ables associated with habitat use are usually based on 
fine-grained microhabitat data that require extensive 
data collection and cannot feasibly be applied to large 
areas. 

The southeastern United States, in particular, Great 
Smoky Mountains National Park, is a species-rich re- 
gion with a complex biota. The park is the largest 
tract of contiguous forest within the southeastern 
United States and the second-largest national park in 
the eastern United States. As part of a network of pro- 
tected areas in the southern Appalachians, the park is 
an important reserve for many Neotropical migratory 
land birds. Surveying the physiological, genetic, and
life-history characteristics of each species in such a
region is a near-impossible task (Martin et al. 1993).
Models based on variables readily available in a GIS 
may offer a solution to such an intractable problem. 
Therefore, we investigated the utility of using com- 
monly available GIS data to assess several topographic 
indexes and other habitat variables as predictors of 
species presence or absence for wood thrush in Great 
Smoky Mountains National Park. 

Great Smoky Mountains National Park represents 
a unique opportunity to examine eastern forest birds 
in a setting free from recent human disturbance. A rel- 
atively large portion of the park has been undisturbed, 
and most of the areas that have been previously al- 
tered have had sixty-five to one hundred years to re- 
cover. This lack of human disturbance enhances the 
likelihood that topographic indexes correlated with 
vegetation features will be useful predictors of bird 
presence or absence. In disturbed areas, variables such 
as stand age and landscape configuration may be 
more-important predictors of vegetation features, but 
in undisturbed areas, topographic indexes are likely to 
be highly correlated with vegetation. Several re- 
searchers (Austin et al. 1984; Yee and Mitchell 1991) 
have been able to successfully predict vegetation char- 
acteristics based on site conditions (Wiser et al. 1998). 
We hypothesized that for areas primarily undisturbed 
by recent human impacts, topographic variables may 
be useful predictors of wood thrush presence or ab- 
sence, since topographic indexes are likely to act as 
predictors of vegetation features. 


Methods 


Great Smoky Mountains National Park comprises 
205,665 hectares of primarily contiguous forest strad- 
dling the Appalachian Trail along the Tennessee- 
North Carolina border. The park serves as the nucleus 
of a group of protected areas in the southern Ap- 
palachians that includes national forests, federally des- 
ignated wilderness areas, state lands, Tennessee Valley 
Authority reservoirs, and National Park Service lands 
(see Fig. 47.1 in color section). This region consists of 
15.1 million hectares, including more than 2 million 
hectares of public land. More than 70 percent of the 
region is currently forested (SAMAB 1996), providing
the largest extent of forested landscape in the eastern
United States. 

Great Smoky Mountains National Park is charac- 
terized by a wide elevation range (575-1,830 meters) 
and complex topography, creating a rich diversity of 
habitat and vegetation types. MacKenzie (1993) used 
Landsat imagery to develop a vegetation map of the 
park based on thirteen major forest types. MacKenzie 
patterned his vegetation scheme after the original clas- 
sification system of Whittaker (1956). The MacKenzie 
vegetation types range from pine and pine oak forests 
at the lowest elevations of the park to northern hard- 
woods and spruce fir at the highest elevations. Seven 
of these vegetation types are regularly used by breed- 
ing wood thrush. In general, these habitats occur 
along a wet-to-dry moisture gradient and include cove 
hardwood, mixed mesic hardwood, tulip poplar, mesic 
oak, xeric oak, pine-oak, and pine. 

The Great Smoky Mountains was established as a 
national park in 1934. At that time, more than half of 
the land had been disturbed, primarily by different 
methods of logging. Other disturbances were caused 
by large-scale fires and clearing of the land for home- 
steads. Current disturbances include extensive loss of 
Fraser fir (Abies fraseri) due to infestation by the ex- 
otic balsam woolly adelgid (Adelges piceae) and de- 
cline of many pine species (Pinus spp.) due to southern 
pine beetle (Dendroctonus frontalis) infestations. The 
hemlock woolly adelgid (Adelges tsugae) is also ex- 
pected to spread to the park within the next decade 
(SAMAB 1996). This exotic pest will likely affect east- 
ern hemlock (Tsuga canadensis) populations and 
therefore impact many of the cove hardwood and 
mixed mesic hardwood forests. Bird species such as 
the wood thrush may be seriously impacted if the park 
experiences major losses of eastern hemlock. More 
than 90 percent of wood thrush nests found in the 
park have been located in eastern hemlocks (Farns- 
worth and Simons 1999). 


Field Data 


We conducted variable circular plot point counts 
(Reynolds et al. 1980) at more than four thousand lo- 
cations throughout our study area during May and 
June of 1996-1999. Due to the large number of observers
(n = 40), we employed several strategies to minimize
observer variability. We trained and tested
all observers prior to the initiation of fieldwork. We 
also provided a review training session at the midpoint 
of the field season. In addition, we rotated observers 
throughout the different areas of the park so that the 
efforts of a single observer were not restricted to a 
particular area of the park. We conducted counts for a 
ten-minute interval between dawn and 10:15 A.M. and 
only in good weather (no rain or excessive wind). Our 
count protocol is consistent with the majority of the 
recommendations for point count methodology de- 
tailed by Ralph et al. (1995b). 

During each ten-minute count, we recorded the 
number of breeding pairs of each bird species present. 
A single observer collected data at any given point. Be- 
fore each count, observers estimated a 50-meter radius 
circle by spotting landmarks using a laser range finder 
and began the count immediately thereafter. At each 
point, we recorded bird detections in all directions for 
an unlimited radius plot. We mapped the location and 
movement of all individual birds detected in order to 
avoid double counting. 

We established the majority of points systematically 
along trails with some points located on minor roads 
and some points located off-trail along transects. The 
majority of trails in Great Smoky Mountains National 
Park are low impact and do not result in a gap in the 
tree canopy. Comparison of counts from points lo- 
cated on- and off-trail showed no significant differ- 
ences (S. Shriner unpublished data). We spaced points 
250 meters apart to avoid double-counting birds. We 
sited points primarily by pacing and occasionally by 
using a laser range-finder. The meandering nature of 
the trails resulted in an average horizontal distance be- 
tween points of approximately 175 meters. In areas 
where we could discern that the trail was very winding,
we paced an additional 50 to 500 meters between 
points. We sited a small number of points in Cades 
Cove, a large, open area. We spaced these points 500 
meters apart to account for an increase in detectability 
due to the sparse vegetation. In the event that an indi- 
vidual bird was heard at more than one point, we only 
recorded the bird at the point with the smaller detec- 
tion distance. 

We stratified points throughout the park with respect
to the availability of each MacKenzie vegetation
type. We used Trimble GeoExplorer II global positioning
system (GPS) units to collect the geographic coordinates
of each point.


Digital Data 


We obtained habitat and topographic variable data 
from a GIS database provided by the Inventory and 
Management Division of Great Smoky Mountains 
National Park. All queries were performed using 
ArcInfo (ESRI 1997) software. The Great Smoky 
Mountains National Park GIS database contains
90x90-meter grid data for vegetation type, bedrock 
geology, and disturbance history. Vegetation type in- 
formation is based on the groundtruthed analysis of 
satellite imagery produced by MacKenzie (1993); it in- 
cludes thirteen vegetation types. The bedrock geology 
data include twenty-four classes of bedrock. The dis- 
turbance history data are based on an analysis of park 
records and include five categories of human distur- 
bance: undisturbed, selective cut, light cut, heavy cut, 
and settlement (Pyle 1985). 

Data for elevation and several topographic indexes 
(including topographic convergence index, terrain 
shape index, landform index, topographic complexity, 
relative moisture, and relative slope position) are 
available in the database as 30x30-meter grids. Eleva- 
tion for each point was calculated from a digital eleva- 
tion model (DEM) using an interpolation function 
based on the nearest-neighbor cells. Slope and aspect 
were computed in ArcInfo using the DEM. Aspect was 
transformed into north/south and east/west compo- 
nents using sin(aspect) and cos(aspect), respectively. 
The different topographic variables available in the 
park GIS were developed by several different re- 
searchers to characterize the shape of the landscape 
and local moisture regimes at various spatial scales. 
These indexes are coded in Arc Macro Language 
(AML) and are available as coverages in the park GIS 
database. The topographic convergence index (TCI) is
an index of potential soil moisture that Beven and
Kirkby (1979) developed to simulate runoff saturation
and infiltration. This index has been successfully
used in spatial models of vegetation distribution.
The underlying formula is $\mathrm{TCI} = \ln(A / \tan B)$,
where A is the surface area of each grid cell providing
drainage and B is the surface slope of the grid cell.
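
As a rough illustration of the TCI formula, the following Python sketch evaluates it for a single grid cell. The function name, the choice of degrees for the slope, and the example values are assumptions made here for illustration; they are not taken from the park's AML code.

```python
import math

def topographic_convergence_index(drainage_area, slope_degrees):
    """Return TCI = ln(A / tan B) for one grid cell.

    drainage_area -- A, the area draining through the cell (assumed > 0)
    slope_degrees -- B, the surface slope of the cell (assumed 0 < B < 90 degrees)
    """
    slope_radians = math.radians(slope_degrees)
    return math.log(drainage_area / math.tan(slope_radians))

# Hypothetical cells: the same drainage area on a gentle and on a steep slope.
gentle = topographic_convergence_index(900.0, 5.0)
steep = topographic_convergence_index(900.0, 30.0)
```

For a fixed drainage area, the gentler slope gives the larger index value, which matches the reading of TCI as potential soil moisture.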
The terrain shape index (TSI) was developed by
McNab (1989) to characterize the topographical cur- 
vature of the landscape. The TSI distinguishes be- 
tween ridges/exposed areas and coves/protected areas 
by calculating the average difference in elevation be- 
tween the center of a plot and its boundary. The land- 
form index (LFI) was also developed by McNab 
(1993) and is a large-area parameter that describes 
general classes of protection at a site, or in other 
words, cove, slope, or ridge. The LFI is the mean of 
eight different slope gradients (N, NE, E, SE, S, SW, 
W, NW) calculated from the center of a plot to the 
skyline. 

Topographic complexity was computed as the 
Shannon-Weaver Index of Topographic Complexity 
(SWI), an index developed by Miller (1986) to explain 
the distribution of rare and endemic plant species. The 
SWI index is a fine-scale index of the topographic di- 
versity of a 150x150-meter plot. Topographic com- 
plexity is calculated for elevation, slope, and aspect 
and then combined into a single measure (Wilds 
1996). The SWI is calculated as
$\mathrm{SWI} = -\sum_i p_i \log_2 p_i$, where $p_i$ is the
proportion of area in each elevation/slope/aspect category.
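
The index can be sketched in a few lines of Python. This is a minimal illustration for a single variable (for example, elevation class); the function name and the example plots are hypothetical, and the way the elevation, slope, and aspect values are combined into one measure is not specified in the text and is not attempted here.

```python
import math
from collections import Counter

def shannon_index(categories):
    """Return -sum(p_i * log2(p_i)) over the categories present in a plot."""
    counts = Counter(categories)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical 150 x 150 m plots: each entry is the elevation class of one cell.
uniform_plot = ["low"] * 25
mixed_plot = ["low"] * 10 + ["mid"] * 10 + ["high"] * 5
```

A plot made up of a single class scores zero, and the score rises as the cells are spread more evenly over more classes.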

Relative moisture is described by the topographic 
relative moisture index (TRMI) developed by Parker 
(1982) to model vegetation distribution in the western 
United States. The TRMI was developed to describe 
local moisture regimes for areas with diverse topogra- 
phy. It is based on aspect, steepness, topographic posi- 
tion, and curvature. Relative slope is also a measure of 
relative moisture and is the distance from the point to 
the bottom of the slope divided by the total distance 
from the bottom of the slope to the nearest ridge. The 
distance to the bottom of the slope is measured to the 
nearest stream or the nearest topographic concavity 
from the point, perpendicular to contour lines. The 
distance to the top of the slope is measured to the 
nearest ridge or topographic convexity, perpendicular 
to contour lines. 


The Model 


The model is a logistic regression model that returns 
the probability of detecting a wood thrush as a function
of habitat and topographic variables. Logistic regression
is a statistical technique used to predict the
probability of an event occurrence (Po) and has been 
used to predict the probability that an organism will 
occur based on the conditions present at a particular 
site (e.g., Margules and Stein 1989; Wiser et al. 1998). 
Probability values are constrained to range between 0 
and 1 and therefore cannot be modeled using linear 
functions. In logistic regression, explanatory variables 
that can range from positive to negative infinity are 
transformed to a probability using a logit link func- 
tion. The logistic regression equation models the logit 
transformation as a linear function of the explanatory 
variables (Christensen 1997). The result is a probabil- 
ity value in the 0 to 1 range. 
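
The logit link described above can be written compactly. This is the standard textbook form; the symbols $\beta_0$, $\beta_j$, $x_j$, and $k$ are notation introduced here for illustration and do not appear in the original text.

```latex
\operatorname{logit}(P) = \ln\!\left(\frac{P}{1 - P}\right)
  = \beta_0 + \sum_{j=1}^{k} \beta_j x_j
\qquad\Longleftrightarrow\qquad
P = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{j=1}^{k} \beta_j x_j\right)}}
```

The right-hand side of the first expression can take any real value, while the inverse transformation always returns a value strictly between 0 and 1.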

We used presence/absence data for wood thrush de- 
rived from the point counts as the dependent variable 
in the model and the eleven habitat and topographic 
variables derived from the GIS as the explanatory 
variables. We ran the model using the PROC LOGIS- 
TIC procedure in SAS (SAS Institute 1990b). We used 
the backward elimination and forward selection pro- 
cedures (threshold p-values = 0.1) in SAS to compare 
different logistic regression models based on the 
eleven variables, the squared values of the topographic 
variables, and interactions between the topographic 
variables. The backward elimination procedure ana- 
lyzes the full model and then removes variables one at 
a time as they fail to meet the specified significance 
level for staying in the model. The forward selection 
procedure begins with an intercept-only model and 
adds variables one at a time based on adjusted chi-square
statistics. A variable is added to the model if it
meets the specified significance level for entry into the
model.

The topographic variables (but not the squared 
terms) were standardized (mean = 0, variance = 1) 
prior to analysis to aid in the interpretation of the pa- 
rameter estimates. If a squared variable or an interac- 
tion variable was significant (p-value less than or 
equal to 0.1), then the main variable was included in 
the model without regard to its significance level. A 
significant interaction term indicates that one of the
variables has a modifying effect on the impact of another
variable.
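
The standardization step can be sketched as follows. The function name and the elevation values are hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption, since the text does not say which was used.

```python
import statistics

def standardize(values):
    """Rescale values to mean 0 and variance 1 (sample variance, an assumption)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

elevations_m = [450.0, 800.0, 1200.0, 1650.0]  # hypothetical point elevations
z_scores = standardize(elevations_m)
```

After this rescaling, a parameter estimate can be read as the change in the logit per standard deviation of the variable, which is what makes estimates for different topographic variables easier to compare.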

We treated the thirteen vegetation types, twenty-three
bedrock geology types, and five disturbance
classes as class variables. We only included type 
variables that were represented by a minimum of 
thirty observations such that the final analysis was 
performed on ten vegetation types and eighteen geol- 
ogy types. All counts with missing data or type vari- 
ables not included in the analysis were deleted from 
the data set, which resulted in 3,743 points available 
for analysis. Each of these data points was randomly 
assigned to either a model development data set 
(n = 1,833) or a validation data set (n = 1,910); only 
the model development data set was used for model 
selection. 

Final model selection was based on concordance 
scores (SAS Institute 1990a,b). Concordance is a 
measure of rank correlation between the predicted 
probability that a wood thrush is present at a site and 
the actual presence/absence of wood thrush at that site 
(Bolger et al. 1997). Concordance is calculated as a 
percentage of all possible pairs of observations where 
one member of the pair represents the presence of a 
wood thrush and the other member of a pair repre- 
sents an absence of a wood thrush. A pair is concor- 
dant if an observation representing an absence of 
wood thrush has a lower predicted event probability 
than does the observation representing the presence of 
wood thrush. A pair is discordant if the observation 
representing an absence of wood thrush has a higher 
predicted event probability than the observation rep- 
resenting the presence of wood thrush. If a pair is nei- 
ther concordant nor discordant, then it is a tie. A 
model with a higher concordance score is more likely 
to correctly classify the presence or absence of wood 
thrush at a particular location. We compared the con- 
cordance scores for the models determined by the 
backward elimination and forward selection proce- 
dures to choose the final model. 
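
The pairwise definition of concordance given above can be sketched directly in Python. This is a brute-force illustration with hypothetical names and data; it compares predicted probabilities exactly, and the SAS implementation may treat near-equal probabilities differently.

```python
def concordance(predicted, observed):
    """Return (concordant, discordant, tied) percentages over presence/absence pairs."""
    presences = [p for p, y in zip(predicted, observed) if y == 1]
    absences = [p for p, y in zip(predicted, observed) if y == 0]
    pairs = len(presences) * len(absences)
    # A pair is concordant when the absence has the lower predicted probability.
    concordant = sum(1 for p in presences for a in absences if a < p)
    discordant = sum(1 for p in presences for a in absences if a > p)
    tied = pairs - concordant - discordant
    return tuple(100.0 * n / pairs for n in (concordant, discordant, tied))

# Hypothetical predictions for five points (1 = wood thrush present, 0 = absent).
predicted = [0.9, 0.6, 0.4, 0.4, 0.1]
observed = [1, 1, 0, 1, 0]
percent_concordant, percent_discordant, percent_tied = concordance(predicted, observed)
```

With three presences and two absences there are six pairs; five are concordant and one is tied, so the three percentages always sum to 100.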


Validation and Model Assessment 


We applied the final model to the validation data set 
to compare the performance of the validation data 
with the model development data. Although model se- 
lection was based on concordance scores, we also 
evaluated the final model according to its ability to 
correctly classify presence and absence for the observed
data. We classified points with a probability
greater than or equal to 0.3 as present. We chose this 
probability because it represents a balance between 
model sensitivity and specificity (see Dettmers et al., 
Chapter 54). 
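
The classification rule and the two quantities it balances can be sketched as follows. The function names and the four example points are hypothetical; only the 0.3 threshold comes from the text.

```python
def classify(probabilities, threshold=0.3):
    """Label a point present (1) when its probability is at or above the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]

def sensitivity_specificity(predicted, observed):
    """Return (sensitivity, specificity) for 0/1 predictions against 0/1 observations."""
    tp = sum(1 for p, o in zip(predicted, observed) if p == 1 and o == 1)
    tn = sum(1 for p, o in zip(predicted, observed) if p == 0 and o == 0)
    fn = sum(1 for p, o in zip(predicted, observed) if p == 0 and o == 1)
    fp = sum(1 for p, o in zip(predicted, observed) if p == 1 and o == 0)
    return tp / (tp + fn), tn / (tn + fp)

probabilities = [0.80, 0.30, 0.29, 0.10]  # hypothetical model output
observed = [1, 0, 1, 0]                   # hypothetical field observations
predicted = classify(probabilities)
sensitivity, specificity = sensitivity_specificity(predicted, observed)
```

Lowering the threshold labels more points as present, which tends to raise sensitivity at the cost of specificity; raising it does the reverse.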


The Probability Map 


We used the logistic regression model to develop a 
probability map for Great Smoky Mountains National 
Park. The probability map represents the probability 
of detecting a wood thrush at any location in the Park. 
We created a 90x90-meter grid cell coverage of the 
park by programming the logistic regression model in 
ArcInfo map algebra language. Because the vegetation 
type, bedrock geology, and disturbance history cover- 
ages are only available as 90x90-meter grids, we 
coded the probability map at that same grain. For 
each grid cell, the values of each of the explanatory 
variables were determined by querying the appropri- 
ate GIS coverage. The probability for each grid cell 
was then calculated based on the logistic regression 
equation to create a probability coverage of detecting 
a wood thrush. 
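
The cell-by-cell calculation can be sketched in Python as a stand-in for the ArcInfo map algebra, which is not reproduced in the text. The function name, layer names, and coefficient values below are hypothetical, and only continuous variables are shown; class variables such as geology type would contribute a per-class term in the same way.

```python
import math

def probability_grid(intercept, coefficients, layers):
    """Apply a logistic regression equation to every cell of aligned grids.

    coefficients -- dict mapping a layer name to its fitted coefficient
    layers       -- dict mapping the same names to 2-D lists of equal shape
    """
    names = list(coefficients)
    rows = len(layers[names[0]])
    cols = len(layers[names[0]][0])
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            linear = intercept + sum(coefficients[n] * layers[n][r][c] for n in names)
            row.append(1.0 / (1.0 + math.exp(-linear)))
        grid.append(row)
    return grid

# Hypothetical one-row, two-cell grids of standardized variables.
layers = {"elev": [[0.0, 1.0]], "lfi": [[0.0, -1.0]]}
coefficients = {"elev": 2.0, "lfi": 1.0}
probability_map = probability_grid(0.0, coefficients, layers)
```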


Results and Discussion 


Two of the topographic indexes (SWI and LFI) tested 
were significantly correlated with wood thrush occur- 
rence (Table 47.1). The model with the highest con- 
cordance score includes these two indexes as well as 
elevation, disturbance history, and geology type. Sev- 
eral squared and interaction terms were also signifi- 
cantly associated with wood thrush presence/absence. 
All of the variables included in the model were signifi- 
cant at the p less than 0.05 level. The overall model 
had a relatively high concordance of 77.9 percent and 
correctly classified 83 percent of the observed data 
points. Application of the best-fit model to the valida- 
tion data set resulted in a concordance of 78.9 percent 
and correct classification of 86 percent of the observed 
data. 

TABLE 47.1.

Result of logistic regression model selection for wood thrush presence/absence using model selection data.

Variable(a)   DF   Parameter estimate   Standard error   Wald chi-square   Pr > chi-square
INTERCPT       1   -12.3752             19.2466            4.9268            0.0264
ELEV           1     9.0385              1.9063           22.4803          < 0.0001
ELEV2          1   < -0.001            < 0.001            43.3424          < 0.0001
SWI            1    -2.8090              0.9866            8.1065            0.0044
SWI2           1     0.0231              0.0075            9.5938            0.0020
LFI            1     0.8238              0.3433          [illegible]         0.0164
ELLFI          1    -1.2637              0.4531            7.7668            0.0053
GEOL          21     --                  --               40.4717            0.0065
DISTURB        4     --                  --               10.3836            0.0344

Concordance: Concordant = 77.9 percent; Discordant = 21.7 percent; Tied = 0.4 percent (356,040 pairs).
(a) Variable abbreviations are as follows: ELEV = elevation, ELEV2 = elevation squared, SWI = Shannon-
Wiener index of Topographic Complexity, SWI2 = SWI squared, LFI = Landform Index, ELLFI = elevation *
LFI, GEOL = bedrock type, DISTURB = disturbance history type.


The parameter estimates indicate that elevation had
the strongest predictive power of the variables included
in the model. This result is consistent with the elevation
range of the wood thrush, which is primarily restricted
to elevations below 1,200 meters while
elevations in the park range up to 1,800 meters. This
elevation boundary is evident in the probability map 
(see Fig. 47.2 in color section), which shows low 
probability of detecting a wood thrush in the center 
(highest elevations) of the park. The Shannon-Weaver 
Index of Topographic Complexity (SWI) was nega- 
tively associated with the likelihood of detecting a 
wood thrush and had the next-greatest explanatory 
power. The SWI is a relatively fine-scale (150-meter) 
measure of land form that describes the diversity of el- 
evations, slopes, and aspects around a center point. 
The model also includes a positive association with the 
coarse-scale Landform Index, which is a broad meas- 
ure of site protection on the scale of kilometers. This 
variable indicates that the probability of detecting a 
wood thrush increases with increasing protection. The 
disturbance history and geology type class variables 
were also significantly associated with the probability 
of detecting a wood thrush. Parameter estimates (Table 
47.2) for individual disturbance types were calculated 
relative to the undisturbed type. Negative parameter 
estimates for areas that experienced logging in the past 
indicate these areas may be less likely to be associated 
with wood thrush occurrence than undisturbed areas. 
It is interesting that the vegetation type variable was 


not identified as being significantly correlated with 
wood thrush presence/absence. It is possible that the 
vegetation type variable did not have any explanatory 
power beyond the variation explained by the topo- 
graphic variables. 

The specific results of this model are unlikely to be 
applicable outside of Great Smoky Mountains Na- 
tional Park, which is largely undisturbed. However, 
this research highlights the potential for topographic 
indexes to be useful predictors of songbird occurrence 
and they should be more commonly tested in habitat 
models. In addition to their possible predictive value, 
topographic indexes are attractive because many of 
them can be easily calculated from DEM data, which 
is often readily available for large areas. These variables
are also appealing because they may be useful
surrogates for vegetation information that might otherwise
have to be collected in the field.


TABLE 47.2.

Parameter estimates for the disturbance history type class variable. Estimates are relative to the undisturbed type.

Type                   Parameter estimate
Selective cut          -0.0222
Light commercial cut   -0.8675
Heavy cut              -0.3603
Settlement area         0.4152


Conclusion 


In order to develop a management strategy for the 
conservation of wood thrush, it is important to iden- 
tify variables associated with species distribution. 
Traditionally, ecologists have sought to identify fine- 
scale microhabitat features associated with occur- 
rence. Microhabitat data is typically only available 
for small areas and can be quite costly and time intensive
to collect, making large-area assessments difficult.
On the other hand, the data needed to develop
models based on topographic variables are often
readily available in GIS databases and can be applied 
to large areas. 
Controlling Bias in Biodiversity Data


David R. B. Stockwell and A. Townsend Peterson 


Advances in information technology are revolutionizing
the way science is being performed. One noticeable
change is the growth of data mining: finding
useful information among ad hoc collections of data 
(Weiss and Indurkhya 1997). In the biodiversity realm, 
museum specimen data are seeing increased attention 
(Peterson et al. 1998). Automated methods of analysis 
are being applied to such data via the World Wide Web 
(Stockwell and Peters 1999) for modeling species distri- 
butions. These models relate the occurrence of the 
species with environmental variables in what has been 
termed a phenomenological model (Maurer, Chapter 
9). There is concern with the quality of the biological 
data used in these models, particularly the fact that mu- 
seum data is not based on a comprehensive and random 
survey design. A large component of these concerns can 
be categorized as concerns about bias. 

Randomized and comprehensive biological surveys 
such as the Maryland Biological Stream Survey are suit- 
able for analysis using a range of statistical methods 
(Southerland and Weisberg 1995). However, random- 
ized surveys of large areas are rare. More commonly, 
data are collected in an ad hoc or opportunistic fashion 
and without random sampling. Because these data vio- 
late assumptions of common multivariate statistical 
methods, their use is deprecated as a source of reliable 
results (James and McCulloch 1990). But does this 
mean that the data are flawed and unusable? 


Another potential problem of museum collections 
data is that the spatial resolution of the records is 
rarely more precise than 0.5 kilometer. Although this
may limit their use for smaller areas, it is adequate for 
use at resolutions greater than 1 kilometer. The scale 
of modeling is largely determined by the scale of the 
environmental variables correlated with the species 
data. As climate variables (temperature and precipita- 
tion) are usually the main factors in these models, and 
the resolution available for these variables is usually 
much greater than 0.5 kilometer (e.g., 2.5 kilometer 
minimum), the resolution of the museum collections 
data is not a practical limitation. 

Another concern with museum collections data is 
that they are composed of records-of-presence of the 
species but no records-of-absence of the species. This 
characteristic can limit the application of chi-squared 
tests and other parametric statistical methods simply 
because they require more than one value. However, it 
has not prevented analysis, as shown by the BIOCLIM 
method, which develops an ecological niche model by 
fitting an environmental envelope at the climatic extremes
of the data set (Nix 1986). A solution is to generate
“pseudo-absences,” a set of data points selected
at random and used as absences that then allows para- 
metric statistics to be applied to the data (Stockwell 
and Peters 1999). In the view of this chapter, presence- 
only data is simply the result of an extreme form of 
bias. 


Therefore, the common concerns with the deficiencies
of museum data do not necessarily mean they are
unusable in analysis. Additionally, when museum col- 
lections databases are combined, museum specimen 
data represent a vast storehouse of species’ occurrences,
placing particular taxa in a complete geographic
and hence ecological context (Peterson et al.
1998). Thus, questions can be answered using these 
data, but inherent bias can undermine confidence in 
the results. These biases include focus of sampling ac- 
tivities in areas or habitats of easy access and concen- 
tration of sampling on taxa that are easily detected, 
captured, or preserved. 

We should remember that problems of bias are also 
not restricted to museum data. Surveys crossing many 
habitats can have different observation rates due to 
the density of the habitat. If estimates of abundance 
are based on these survey results, patterns of abun- 
dance can be distorted by the relative difficulty of sur- 
veying the habitat. Methods of controlling bias are 
thus of interest, not only for analyzing museum data, 
but also because they are potentially applicable to any 
type of data. 

There is sometimes the possibility of modifying the 
survey design to reduce bias, or augmenting records 
with field notes where species were seen but not col- 
lected. In this chapter, however, we are concerned with 
strategies for dealing with bias after data collection. 
Developing analytical methods less affected by bias, or 
methods that can reduce bias prior to analysis, pres- 
ents many advantages. Control of bias allows use of 
data that could not otherwise be used. Hence, flexible 
bias management should be an important feature of 
learning systems in applications for automated model- 
ing, based on data mining. More generally, a better 
understanding of bias in ad hoc data sets, such as 
most biodiversity information, is needed. 

The analysis of biodiversity data we are concerned 
with is the ecological niche modeling of species, which 
quantifies probability of occurrence as a response to 
environmental variables. To determine the effects of 
bias on model performance, we develop models of 
species’ ecological niches using two major environ- 
mental dimensions, temperature and rainfall. We de- 
fine an experimental protocol for incorporating bias 
and then controlling its effects on predictions of 


species distribution. We show the effect of methods of 
bias control on accuracy and predicted distribution 
maps of a well-surveyed species. 

The analysis in this study uses methods incorpo- 
rated into the GARP (genetic algorithm for rule-set 
prediction) modeling system (Stockwell and Noble 
1992; Stockwell 1999; Stockwell and Peters 1999). 
The performance of the methods on biased and unbi- 
ased data is compared with and without methods for 
controlling bias. The primary concern is to demonstrate
that the methods do indeed control bias in data
sets. 


Forms of Bias 


A number of meanings exist for the word bias. Clarifi- 
cation of terminology is essential if workers are to 
communicate effectively in ecology (Morrison and 
Hall, Chapter 2). Unfortunately, the semantics of bias 
has not been treated as extensively in the ecological 
literature. Bias can be interpreted in at least three 
ways: statistical bias, inductive bias, and sampling 
bias. 


Statistical Bias 


The tendency of a statistical parameter, such as a 
mean or variance, to converge on a value not equal to 
its true value (Mendenhall et al. 1981) is referred to as 
statistical bias. For example, if the mean of a population
parameter, such as abundance or probability, is
$P_1$, a measure of that population parameter is biased if,
given a large number of samples, it gives a value $P_2$,
where $P_2 \neq P_1$.


Inductive Bias 


Algorithms and statistical models are often used to ex- 
plore patterns in data. For example, stepwise elimina- 
tion of variables in linear regression is an iterative al- 
gorithm for exploring possible models with a minimal 
number of variables. This process of developing a 
model through fitting a range of models to data is 
called induction. However, the effectiveness of such 
approaches depends critically on algorithmic design 
choices, such as representation language and perform- 
ance criteria (Forsyth 1981). Little is known about the 
effects of these choices and their relationships to types 
of data (DeJong and Mooney 1986). One of the most
difficult problems in practical induction is assessing 
the way the design choices interact with data sets ana- 
lyzed (Turney 1991). In practice, factors are manipu- 
lated experimentally until the algorithm meets set cri- 
teria of adequacy in the application domain. When 
done informally, this practice is known as “tweaking.” 
Because the biases in an algorithm are rarely made ex- 
plicit, researchers struggle with the application of vast 
numbers of subtle biases (Michalski 1983). 


Sampling Bias 


A characteristic of data relative to the population 
from which they are drawn is sampling bias, which 
can cause statistical bias if the set of data is a nonran- 
dom subset of the entire population. The most typical 
example for museum data is data collected from sites 
in a limited area rather than from over the whole area. 
However, sampling bias can also occur where data 
come from a set of representative sites in which some 
of the sites consist of multiple samples, such as collec- 
tions focused in one area but only a few in another. 

Statistical, inductive, and sampling bias can be 
present in any empirical study. Owing to the difficulty 
of obtaining truly random samples and robust meth- 
ods that would be free of these problems, it is possible 
that bias is the greatest impediment to the reliable 
analysis of natural data sets. 


Controlling Bias 


Where random sampling is not possible because data 
have already been collected, an effective a posteriori 
approach to controlling bias is needed. This includes 
preparation of data for the analysis, both through 
selection of controlled sets of data to be modeled 
and through the selection of the independent vari- 
ables used to develop the model. We now examine 
sources of bias more peculiar to biodiversity data and 
explore approaches to controlling them through data 
manipulation. 


Presence-only Bias 


Much of biodiversity data—especially data associated 
with scientific specimens—present an odd situation in 
which the location of specimens collected is recorded 
(positive data) but absence of species is not. These
data represent only a subset of possible information: 
points of presence only. The approach for controlling 
this source of bias is to generate pseudo-absences: 
points that provide a contrast in prior probabilities of 
presence to the given data. One method is to generate points
at random anywhere the species has not been
recorded, called background points; this method is
used in the GARP system (Stockwell and Peters 1999).
Alternatively, these points could be drawn from spe- 


cific areas known not to hold the species, based on 
field notes of collectors or the recollection of experi- 
enced field workers. 
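
The random background approach can be sketched as follows. This is not the GARP implementation; the function name, the grid-cell representation, and the fixed random seed are assumptions made for illustration.

```python
import random

def pseudo_absences(all_cells, presence_cells, n, seed=0):
    """Draw n cells at random from those with no recorded presence."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    candidates = sorted(set(all_cells) - set(presence_cells))
    return rng.sample(candidates, n)

# Hypothetical 4 x 4 study area with three cells holding presence records.
cells = [(row, col) for row in range(4) for col in range(4)]
presences = [(0, 0), (1, 2), (3, 3)]
background = pseudo_absences(cells, presences, 5)
```

Drawing from cells known not to hold the species, as in the alternative described above, would only change how `candidates` is built.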


Abundance Bias 


Two examples of abundance bias are rarity and 
hotspots. Species can be recorded from very few loca- 
tions or many owing to a number of factors: inherent 
ease of encounter or capture, intrinsic rarity, or diffi- 
culty of capture or preparation. These imbalances can 
be regarded as a form of bias, since frequencies are af- 
fected by observational factors. In hotspots, large 
numbers of specimens may be recorded at some loca- 
tions because a particular collector focused activities 
at particular sites, or because many collectors worked 
particular sites more than others did. The approach 
here is to map the data into a regular grid (rasterizing 
data), allowing only one data point to be selected 
from each grid cell. We then control this source of bias 
by generating data sets for analysis with even distribu- 
tions, using resampling. 
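
The rasterizing step can be sketched in Python. The function name, the coordinates, and the rule of keeping the first record encountered in each cell are assumptions for illustration; the text does not say which record is retained.

```python
import math

def one_point_per_cell(points, cell_size=0.5):
    """Keep a single (lon, lat) record for each cell of a regular grid."""
    kept = {}
    for lon, lat in points:
        cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        kept.setdefault(cell, (lon, lat))  # later records in the same cell are dropped
    return list(kept.values())

# Hypothetical records: the first two fall in the same 0.5-degree cell.
records = [(-84.1, 35.6), (-84.2, 35.7), (-83.4, 35.6)]
thinned = one_point_per_cell(records)
```

After thinning, a heavily collected site contributes no more to the analysis than a site visited once, which is the intent of the control.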


Correlation Bias 


Bias can be introduced through the tendency of sam- 
pling to be more frequent in certain geographic fea- 
tures. Collections along roadsides are a particularly 
frequent example. Another is the tendency of inacces- 
sible habitat types such as aquatic vegetation or mon- 
tane areas to be underrepresented. The approach de- 
veloped here is to identify those variables correlated 
with the form of bias and eliminate them from the set 
of possible predictor variables in the model. Certain 
methods of analysis may be more affected by correla- 
tion bias than others might. 
Methods


The environmental data employed for predicting the 
distributions consisted of two climate layers: average 
annual temperature (Fig. 48.1A) and average annual 
precipitation (Fig. 48.1B). These data layers were taken 
from the NOAA-EPA (1992) data set, comprising the 
continental United States on a 0.5-degree raster
grid. In practical modeling applications, many other
variables may be required to obtain highly accurate pre- 
dictions of species’ distributions. For simplicity, though, 
as the purpose of this study is illustrative rather than 
predictive, only these two variables were used. 

The species modeled was Hylocichla mustelina, the 
wood thrush. The point occurrence data used in this 
study were drawn from the U.S. Breeding Bird Survey 
(BBS) data set (Sauer et al. 1997). This data set 
presently consists of approximately 6x10? records of 
bird species sighted in yearly censuses at sites
across North America. The surveys are conducted 
along prescribed transects approximately 45 kilome- 
ters in length. The grid cells at 0.5-degree resolution 
containing survey paths are shown in Fig. 48.1D. 


Experimental Protocol 


Our experimental protocol was designed to allow 
comparison of the accuracy of models and distributions
developed on biased data with and without the
method of controlling bias. The exact approach depends
on the particular bias. In general, the protocol
consisted of:


1. Take a large data set with almost-complete data 
about the distribution of a species. 

2. Sub-sample those data in a controlled way to de- 
velop data with the bias being considered. 

3. Develop models on the biased data with and with- 
out the method of controlling bias. 

4. Evaluate the accuracy of the model by comparison 
with the known distribution. 


Bias in the data used to develop the model should 
lead to inaccuracy in modeling niches and in resulting 
geographic distributions. Using biased data, the con- 
trol method should lead to increased accuracy, com- 
pared with analysis without bias control. 

In this protocol, the comparison data consist of 
all grid cells that have been surveyed by the BBS— 
determined to be approximately 50 percent of the grid 
cells in the continental United States. Data sets for 
analysis are random samples of (1) sites where birds 
have been observed (presences), and (2) sites where 
they have never been observed (absences). Data sets 
for analysis contain 2,500 sampled points. 

Figure 48.1. The variables used in this study for developing models: (A) the average annual temperature and (B) average
annual precipitation for the continental United States at a resolution of 0.5 degrees derived from interpolated weather station
data, (C) a simulated road variable, and (D) a mask consisting of cells containing survey locations. Legend: white = masked.

We used three modeling methods. The first, logit,
was a quadratic logistic regression model for predicting
binary variables that has been applied to predicting
the probability of occurrence of species (Austin et
al. 1990). The model was developed to fit probabilities 
to a linear combination of variables where the response
could be either linear or unimodal. Examples
of species’ responses to environment that would be
well modeled by this method are a probability that
increases with increasing precipitation (linear) or one
that peaks within a particular temperature range
(unimodal). The probability of occurrence $P$ is obtained
by applying the logistic function, $P = 1/(1 + e^{-Y})$,
to the linear predictor


$$Y = a_1 x_1 + a_2 x_1 x_1 + a_3 x_2 + a_4 x_2 x_2 + \dots + a_{2n} x_n x_n$$


where $x_i$ represents the value of a particular environmental
parameter $i$ and $a$ represents a fitted coefficient.
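
A minimal Python sketch of this model follows, assuming the usual logistic link between the linear predictor and the probability. The function name, the coefficient ordering (linear term, then quadratic term, for each variable), and the coefficient values are hypothetical.

```python
import math

def quadratic_logit_probability(x, a):
    """Return P for environmental values x and coefficients a.

    x -- n environmental values
    a -- 2n coefficients ordered (linear, quadratic) for each variable in turn
    """
    y = sum(a[2 * i] * xi + a[2 * i + 1] * xi * xi for i, xi in enumerate(x))
    return 1.0 / (1.0 + math.exp(-y))

# Hypothetical unimodal response to one variable: Y = 2t - 0.1t^2, peaking at t = 10.
coefficients = [2.0, -0.1]
at_peak = quadratic_logit_probability([10.0], coefficients)
below_peak = quadratic_logit_probability([0.0], coefficients)
above_peak = quadratic_logit_probability([20.0], coefficients)
```

A negative quadratic coefficient gives the unimodal response; setting it to zero leaves a monotonic response on the logit scale.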

The second method, envelope, is an envelope-fitting 
method similar to BIOCLIM (Nix 1986). An envelope model
uses only presence points, fitting an envelope
around the range of the set of points. In contrast,
the logit model cannot be used for presence-only data 
as it requires both presence and absence points for fit- 
ting. This model is described as 


P = 1 if x1 in(aj,42) A x2 in (a3,44) A... A Xp in (a5, 1,dp) 
0 otherwise 


The parameters a;... a, in a presence-only model 
are the minimal range of the parameters that fits all 
the data between them. The values for each pair of pa- 
rameters are calculated variable by variable. 
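A minimal sketch of envelope fitting follows. The function names are hypothetical, and the ranges are treated as closed intervals so that the presence points used for fitting fall inside the envelope; this is an interpretation of "the minimal range that fits all the data", not the GARP implementation.

```python
def fit_envelope(presence_points):
    """Fit one (min, max) range per variable from presence records only."""
    if not presence_points:
        raise ValueError("at least one presence point is required")
    # Transpose the records so each column holds one variable.
    columns = list(zip(*presence_points))
    return [(min(col), max(col)) for col in columns]


def predict_envelope(envelope, point):
    """Return 1 if every variable lies inside its fitted range, else 0."""
    inside = all(lo <= v <= hi for (lo, hi), v in zip(envelope, point))
    return 1 if inside else 0


# Hypothetical presence records: (temperature, precipitation).
presences = [(10.0, 800.0), (14.0, 1200.0), (12.0, 950.0)]
envelope = fit_envelope(presences)
inside_prediction = predict_envelope(envelope, (11.0, 900.0))
outside_prediction = predict_envelope(envelope, (16.0, 900.0))
```

Because each range is computed independently, a single outlying presence record widens the envelope for that variable, which is one reason presence-only envelopes tend to predict broad distributions.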

The third method, the GARP rule-set method, is used in most other
analyses (Stockwell and Noble 1992; Stockwell and
Peters 1999). This artificial intelligence method devel- 
ops a set of models, including regression and envelope 
models as described above. The models are optimized 
in an iterative fashion by incremental modification 
and testing in an evolutionary algorithm. The result- 
ing set of models is output and available for predic- 
tion. In using the set of models for prediction, the best 
model at each prediction point (grid cell) is selected 
and the result used as the predicted value. This 
method has shown greater robustness in a variety of 
applications due to its capacity to draw from a set of 
possible models. These and other algorithms for 
species distribution modeling are implemented in the 
GARP analysis package (http://biodi.sdsc.edu). 


Results 


The accuracy of the results is quoted as the 95th percentile range
of the mean accuracy over six trials. Using this method of
comparison, the actual accuracy is expected to fall within this
range 95 percent of the time. Accuracies are not significantly
different at the 95 percent confidence level if the quoted ranges
overlap. If the ranges do not overlap, we may be confident in
asserting that the expected accuracies are significantly different.
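The overlap comparison can be sketched as follows. The chapter does not state how the ranges were computed, so the normal approximation (mean plus or minus 1.96 standard errors) is an assumption, and the computed ranges need not match the quoted ones exactly. The means and standard deviations are those reported in Table 48.1 for the envelope and logit methods, with six trials each.

```python
import math


def mean_range(mean, sd, n, z=1.96):
    """Approximate 95 percent range of a mean from n trials.

    Uses a normal approximation (an assumption, not necessarily the
    authors' procedure): mean +/- z * sd / sqrt(n).
    """
    half_width = z * sd / math.sqrt(n)
    return (mean - half_width, mean + half_width)


def ranges_overlap(r1, r2):
    """True if two closed intervals share at least one point."""
    return r1[0] <= r2[1] and r2[0] <= r1[1]


# Values from Table 48.1, six trials each.
envelope_range = mean_range(0.685, 0.014, 6)
logit_range = mean_range(0.748, 0.010, 6)
significantly_different = not ranges_overlap(envelope_range, logit_range)
```

Under this approximation the two ranges do not overlap, which agrees with the statement in the text that the improvement is significant.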

The predicted distribution of the experiments 
shown in Figures 48.2 to 48.4 should also be consid- 
ered along with accuracy figures, as they are a more 
sensitive indicator of the results. The panels in the figures are
referred to in row-column order, with A in the top lefthand corner
and D in the bottom righthand corner.


Presence-only Bias 


In this protocol, we examined the bias arising from presence-only
data by comparing the results of different modeling methods. The
envelope modeling method used only data from sites where the species
occurred, fitting a climatic envelope around the presence data
points. The envelope model minimizes error of presence, using the
GARP option "envelope" to produce a model capable of predicting
presences and absences. The presence rule encloses all the data
points and the absence rule is the negation of that rule, predicting
absence for all points outside the range of the variables. We then
added pseudo-absences by incorporating "background" information into the


TABLE 48.1.


The accuracy of trials using presence-only bias control (see
Fig. 48.2).


          B. Bias               C. Bias control                D. Bias control

Data      Presence-only data    Presence and background data   Presence and background data
Method    Envelope method       Logit method                   GARP method
n         6                     6                              6
mean      0.685                 0.748                          0.778
s.d.      0.014                 0.010                          0.015
Figure 48.2. The predicted distributions of Hylocichla mustelina, or wood thrush, using modeling methods that make use
of presence data only, and presence and absence data: (A) the actual distribution of the wood thrush, (B) the (biased)
predicted distribution using the bioclimatic envelope method, (C) the (bias controlled) predicted distribution using the
GARP rule-set method, and (D) the (bias controlled) predicted distribution using the logistic regression method. Legend:
black = absent, white = present, gray = unpredicted.


data. Models that minimize error of presences and ab- 
sences are the rule-set method and logistic regression 
(logit). 
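Augmenting presence records with background pseudo-absences can be sketched as below. The function name and the choice to draw background cells from the whole study area (so that a background cell may occasionally coincide with a presence cell) are assumptions for illustration; the chapter says only that the pseudo-absences come from a random background population.

```python
import random


def add_pseudo_absences(presence_cells, all_cells, n_background, seed=0):
    """Label presences 1 and randomly drawn background cells 0.

    Background cells are sampled without replacement from every cell
    in the study area (an assumption; they are not checked against
    the presence cells).
    """
    rng = random.Random(seed)
    background = rng.sample(list(all_cells), n_background)
    data = [(cell, 1) for cell in presence_cells]
    data += [(cell, 0) for cell in background]
    return data


# Hypothetical 10 x 10 study area with two presence cells.
study_area = [(row, col) for row in range(10) for col in range(10)]
training = add_pseudo_absences([(0, 0), (1, 1)], study_area, n_background=20)
```

The resulting labeled set can be passed to any method that needs both classes, such as logistic regression.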

The accuracy in each of these experiments is shown in Table 48.1.
The accuracy of predictions of the species' distribution with the
envelope method is between 0.67 and 0.69. The accuracy with
background data and the logit method is between 0.74 and 0.75, a
significant improvement. The GARP rule-set method gave the highest
accuracy (0.77 to 0.78) of the three methods.

The comparison of the predicted distributions is shown in Figure
48.2. The envelope model developed using presences only (B) predicts
a broader distribution than the rule-set model (D) that uses
presence and absence data. Providing pseudo-absences based on a
random background data population helps to constrain the model and
increase accuracy of prediction of the absence of species. The
distribution predicted by the logistic model is also shown (C). This
method considerably overpredicts the distribution of the species.


Figure 48.3. The predicted distributions of Hylocichla mustelina, or wood thrush, using different proportions of presence and absence data. The actual distribution (A) was sampled according to proportions in A, to produce (biased) predicted distribution B. Sampling according to strongly biased proportions in original data produced predicted distribution C. Sampling from A using even proportions (bias controlled) produced predicted distribution D. Legend: black = absent, white = present, gray = unpredicted.

Figure 48.4. The predicted distribution of Hylocichla mustelina, or wood thrush, from presence and absence data selected for proximity to simulated roads. The predicted distribution of the bird in A and B uses the logistic regression method. In A, the (biased) model was developed using climate and road proximity variables and shows the pattern of the roads. In B, the (bias controlled) model was developed without the road variable and shows a more accurate predicted distribution. Maps C and D are the predicted distributions of the bird using the GARP (genetic algorithm for rule-set prediction) model. In C, the (biased) models used the climate and road variables. In D (bias controlled), only climate variables were used. The predicted distribution using the GARP model did not show a pattern of roads in either case. Legend: black = absent, white = present, gray = unpredicted.


Abundance Bias 


In this protocol, we develop a biased data set by using
data points with varying proportions of presences and
absences. For example, a form of hotspot bias in the
BBS data could be due to sampling of the same area 
over many years. The method of controlling this bias 
is “rasterizing”—mapping the temporal data onto a 
grid and allowing just a single point to represent each 
grid cell. 
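Rasterizing can be sketched as collapsing repeated survey records into one value per grid cell. The record layout and function name are hypothetical; the 0.5-degree cell size follows the resolution given for the environmental layers, and the rule that a cell counts as a presence if the species was ever observed there follows the definition of presences and absences in the protocol.

```python
import math


def rasterize(records, cell_size=0.5):
    """Collapse (lon, lat, present) records to one value per grid cell.

    A cell is a presence if the species was observed in any record
    falling in it, and an absence if it was never observed there.
    """
    cells = {}
    for lon, lat, present in records:
        key = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        cells[key] = cells.get(key, False) or bool(present)
    return cells


# Hypothetical survey records: three repeat visits to one cell and a
# single visit to another.
records = [
    (-90.1, 35.2, 0),
    (-90.2, 35.3, 1),
    (-90.3, 35.4, 0),
    (-80.0, 40.0, 0),
]
grid = rasterize(records)
```

Three visits to the same cell contribute a single presence, so heavily surveyed areas no longer dominate the data set.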

Although rasterizing reduces some of the redundancy in the data,
models are still exposed to rarity bias. For example, the proportion
of cells occupied by the species influences the proportions in the
data used to develop the model. If all data points from the grid are
used, the biased set has a small proportion of presences as compared
with absences. We controlled this source of bias by sampling to an
even proportion of data points among presences and absences.
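Resampling to even proportions can be sketched as drawing half of the training points from presence cells and half from absence cells. Sampling with replacement is an assumption made here so that a rare class can still fill its half; the function name and the example grid are hypothetical, while the total of 2,500 points matches the data set size given in the protocol.

```python
import random


def resample_even(cells, n_total, seed=0):
    """Draw n_total labeled points, half presences and half absences.

    cells maps a grid cell to True (presence) or False (absence).
    Points are drawn with replacement (an assumption) so that a rare
    class can still supply half of the sample.
    """
    rng = random.Random(seed)
    presences = [c for c, present in cells.items() if present]
    absences = [c for c, present in cells.items() if not present]
    if not presences or not absences:
        raise ValueError("both presences and absences are required")
    half = n_total // 2
    sample = [(c, 1) for c in rng.choices(presences, k=half)]
    sample += [(c, 0) for c in rng.choices(absences, k=half)]
    rng.shuffle(sample)
    return sample


# Hypothetical grid in which only 5 percent of cells are presences.
example_cells = {(i, 0): (i < 5) for i in range(100)}
balanced = resample_even(example_cells, n_total=2500)
```

Even though only 5 of 100 cells hold presences, the sample presented to the modeling method contains equal numbers of each class.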

In the example species, the cells where the species was found
constitute 44 percent of the total cells. The biased data were
modeled with 44 percent presences and 56 percent background (Fig.
48.3B). The data were again biased to a proportion of 5 percent
presences and 95 percent absences (Fig. 48.3C). The bias controlled
data sets had a proportion of 50 percent presences and 50 percent
background (Fig. 48.3D). The accuracy expected using the rule-set
method on data sets of 44/56 proportions was between 0.78 and 0.79
(Table 48.2). The accuracy achieved on data sets of 5/95 proportions
was between 0.80 and 0.81. The expected accuracy using even
proportions of data was between 0.78 and 0.79.

The accuracy of the models is similar in each case, indicating that
the rule-set method is not sensitive to the relative proportions of
presence and absence data used to develop the model. Examination of
the predicted distributions (Fig. 48.3) shows the most extreme
abundance bias (Fig. 48.3C) predicted less area for the distribution
of the species relative to other models (Fig. 48.3A).


TABLE 48.2.


Accuracy of trials of abundance bias control (see Fig. 48.3).


          B. Bias                    C. Bias control            D. Bias control

Data      Uneven (44/56)             Uneven (5/95)              Even proportions of
          proportions of presences   proportions of presences   presences and absences
          and absences               and absences
Method    GARP method                GARP method                GARP method
n         6                          6                          6
mean      0.787                      0.812                      0.783
s.d.      0.015                      [illegible]                0.008


Correlation Bias 


In this protocol, we develop a biased data set by preferentially
selecting data in proximity to a
given set of lines crossing the study area. The bias is
created by a mask that simulates roads that might 
cross the region (Fig. 48.1C). This protocol simulates 
a collection strategy where the presence or absence of 
specimens is recorded along roadsides. 

The control of this form of bias includes both 
methods and the strategy for inclusion of predictor 
variables. In the evaluation of logit and GARP meth- 
ods, the road variable is first included as a potential 
predictor variable. The method of bias control is omis- 
sion of the biased variable in development of the 
model. The biased data set is then analyzed without 
the simulated road variable. 


The accuracy of the biased logit model was between 0.67 and 0.68
(Table 48.3). The accuracy of the bias controlled model was
significantly higher, between 0.75 and 0.76. Figure 48.4A clearly
shows that logit analysis of biased data with the simulated road
variable present predicts a species distribution containing patterns
of the road distribution. When a logistic regression analysis is
performed without the road variable using the biased data, the
pattern of roads is not evident (Fig. 48.4B). Thus, removing the
variable that is correlated with the bias in the data helps to
control the effects of the bias in the logistic regression method.

When the analysis is performed using the rule-set method and the
road proximity variable, the accuracy is between 0.76 and 0.77, and
the resulting predicted distribution closely resembles the actual
distribution of the bird (Fig. 48.4C). A slightly more accurate
result (approximately 0.79) was achieved without the simulated road
variable. This demonstrates clearly that the rule-set method is
capable of predicting the correct distribution, even in the presence
of the variable correlated with the bias in the data.


TABLE 48.3.


The accuracy of trials using correlation bias control (see Fig. 48.4).


              A. Bias          B. Bias control     C. Bias          D. Bias control

Predictors    Climate and      Climate and no      Climate and      Climate and no
              simulated road   simulated road      simulated road   simulated road
Method        Logit method     Logit method        GARP method      GARP method
n             6                6                   6                6
mean          0.675            0.758               0.773            0.795
s.d.          0.005            0.008               0.018            0.008


Discussion


The above results include the distribution predictions and accuracy
of models of the wood thrush developed with and without bias control
methods. The presence-only and correlation bias control methods
showed improvement in the accuracy of predicted distributions when
dealing with the respective forms of biased data.
The use of background data to provide pseudo-absences for the GARP
rule-set model gave greater accuracy than an envelope model
developed using only presence data (Table 48.1). This demonstrates
that having only positive information about presence of a species
and no information about absences of that species does not
necessarily present an obstacle to accurate prediction.

Abundance bias did not show a strong effect on the 
accuracy of models developed using differing propor- 
tions of presence and absence data, even when the 
proportion varies between 50/50 and 5/95. However, 
the effect could be seen on the predicted distributions, 
where the uneven proportions resulted in contraction 
of the predicted distribution (Fig. 48.3C). 

The effect of correlation bias on predicted distribu- 
tions was clearly shown using the simulated road vari- 
able in a logistic regression analysis (Fig. 48.4A). The 
removal of this variable as a potential predictor pro- 
duced accurate distribution patterns on data strongly 
biased with a simulated road mask. The rule-set 
method also produced accurate distributions with or 
without the biasing road proximity variable. 

In general, the bias control methods improve the 
predicted distributions of species, particularly in the 
case of the logistic regression method. The GARP 
method was less susceptible to bias in the data but 
did show quantitative improvements when the bias 
control methods were applied. In the case of logistic 
regression modeling, elimination of the biasing vari- 
able when analyzing correlation bias allowed a dra- 
matic improvement in the accuracy of the predicted 
distribution. These results demonstrate the potential 
of bias control methods to overcome problems posed 
in analysis of museum data. These improvements 
were quantified by accuracy measures and, even more 
clearly, through visual examination of the predicted 
distributions. 

In this study, we have categorized biases into three types based on
where the bias is present. Abundance bias occurs in the proportions
of data presented to the analytical method. Presence-only bias can
be regarded as an extreme form of abundance bias in which the
presence data are highly abundant. The problem with these biases is
that they result in less information being effectively present. In
the case of abundance biases (hotspots and rarity), the samples of
biased data contain few points providing information about
presences. In the case of presence-only bias, no information about
absences is present. Resampling to even proportions of presences and
absences corrects this by putting more information into the training
data set.

In presence-only bias and correlation bias, the bias 
in the data sample pushes methods into suboptimal 
solutions. In presence-only bias, the model encloses all
presence points, a solution that would be optimal if only presence
points mattered but that reduces accuracy
by predicting too many absences as presences. In the
case of correlation bias and the logistic regression al- 
gorithm, the correlation of presence data with the bi- 
asing (road) variable ensures that the dummy variable 
figures highly in the model. The variable is actually ir- 
relevant to the species and its ecology. 

These are cases of inductive bias because the as- 
sumptions of the analytical system force the subopti- 
mal solutions. The removal of the biasing variable 
from the set of predictor variables in susceptible 
analysis methods such as logistic regression helps to 
control correlation bias. The redundancy in the GARP 
rule-set model makes it less susceptible to correlation 
bias, as models in the rule set that incorporate the bi- 
asing variable are not necessarily used in predicting 
the presence or absence of the species. 

The biases described above represent the main bi- 
ases we have encountered in analyzing museum data, 
although others certainly exist. With the exception of 
correlation bias, methods of controlling bias can be in- 
corporated into the standard operation of the model- 
ing system: rasterizing, use of pseudo-absences, and 
resampling in even proportions. That is, the control 
method operates without human intervention and can 
therefore be part of the standard methodology. 

In the case of correlation bias, we need to determine
which variables are likely to cause increased
error due to correlation with the sampling effort
of the survey. Roadsides have been mentioned,
and consequently variables based on proximity to 
roads should never be used. Biasing variables may be 
detectable using jackknife and other resampling ap- 
proaches applied to environmental data sets (Peterson 
and Cohoon 1999). 
The logistic regression method requires the identification
and removal of potential biasing variables to
control bias in the data. The GARP rule-set method 
shows the capacity, in this example, to control corre- 
lation bias without the need to identify biasing pre- 
dictor variables. These results show the great poten- 
tial of the GARP rule-set method for utilizing data 
containing unknown biases, such as museum collec- 
tions data. 


Conclusions 


Managers could use these methods to improve the accuracy of
predicting species distributions when using museum data or other ad
hoc sampled data. They could reduce potential errors due to bias by


1. Evaluating the potential for bias and looking for 
patterns of bias in their biodiversity data. 

2. Identifying and removing variables that correlate 
with bias from the analysis. 

3. Applying the strategies implemented in the GARP 
modeling system for reducing the effect of bias: 
rasterizing and, where there are no absence obser- 
vations, augmenting the data set with randomly se- 
lected background data. 


CHAPTER 


49 


Modeling Cowbird Occurrences and 
Parasitism Rates: Statistical and 
Individual-based Approaches 


Ann-Marie Shapiro, Steven J. Harper, and James Westervelt 


Fort Hood, Texas, a large army installation, actively
manages significant breeding populations of the 
black-capped vireo (Vireo atricapillus) and the golden- 
cheeked warbler (Dendroica chrysoparia) within its 
boundaries (Weinberg et al. 1998; Jette et al. 1998). 
Both species are susceptible to brood parasitism by the 
brown-headed cowbird (Molothrus ater). Long-term 
increases in cowbird populations and resultant para- 
sitism pressure have been implicated as significant fac- 
tors in the decline of a large number of passerines 
(Robinson and Wilcove 1994; Robinson et al. 1995b). 
On the Fort Hood landscape, the black-capped vireo 
appears to be particularly vulnerable to reduced re- 
productive success from cowbird parasitism (Hayden 
et al. 2000). Low productivity due to cowbird para- 
sitism has been identified as a major reason for the en- 
dangered status of this species (USFWS 1991). As a re- 
sult, an important component of endangered species 
management on Fort Hood is the systematic live trap- 
ping and removal of female cowbirds from the instal- 
lation landscape (Eckrich et al. 1999). In 1998, over 
3,800 female cowbirds were trapped and 159 were 
shot within the breeding season on Fort Hood (Eck- 
rich and Koloszar 1998). In comparison, there were 
fewer than 250 black-capped vireos and only eighty-nine
territorial male golden-cheeked warblers documented
on the installation (although these data come
from intensively studied sites and do not reflect total
installation populations; Koloszar and Bailey 1998;
Craft et al. 1998).

Cowbird control has proved successful and has 
generated a large increase in black-capped vireo nest- 
ing success. For example, over 90 percent of black- 
capped vireo nests were parasitized prior to initiation 
of the control program, while in recent years fewer 
than 15 percent of nests have been parasitized (Eckrich
et al. 1999). This constitutes fairly strong evidence
that until recently cowbird parasitism has been a lim- 
iting factor in the reproductive success of this species. 
Without landscape-level changes in factors supporting 
cowbird populations on Fort Hood, control of cow- 
birds through trapping and shooting must be con- 
ducted indefinitely for continued benefits to be real- 
ized. An experiment at Fort Hood showed that 
cessation of the control program resulted in increases 
in cowbird densities, increases in parasitism levels, 
and decreases in the reproductive success of black- 
capped vireos (Cook et al. 1998). Therefore, one goal 
of land managers at Fort Hood is to optimize the 
placement of traps in space and time in order to cap- 
ture the most cowbirds with the least effort. 

Our research team has developed and applied simu- 
lation models to assess landscape- and population- 
level implications of various endangered species man- 
agement efforts, including the cowbird control 
program (Trame et al. 1997, 1998). After efforts to 
model cowbird parasitism using statistical approaches 
within a dynamic landscape simulation model, the
Fort Hood Avian Simulation Model (FHASM), failed 
to produce spatially explicit results (Trame et al. 
1997), we developed the individual cowbird behavior 
model (ICBM; Trame et al. 1998; Harper et al. 2001) 
using an individual-based approach. Despite a wealth 
of empirical data related to cowbird parasitism, cow- 
bird control efforts, and reproductive success in the 
two endangered species, and somewhat less knowl- 
edge about the vegetation on Fort Hood, a “classical,” 
statistical approach to modeling cowbird parasitism 
proved unsatisfactory. The simple, rule-based ap- 
proach of individual-based modeling appears to have 
served better for this particular application. 
Individual-based models differ from aggregated, 
state-variable approaches in both philosophy and con- 
tent (Grimm 1999). Most classical approaches in ecol- 
ogy, whether static or dynamic, rely on averaging the 
attributes of a population and projecting the emerging 
characteristics, which may not represent the variabil- 
ity exhibited by the natural system. Model projections 
can be improved, to a certain extent, through the ad- 
dition of greater complexity in model structure and 
the incorporation of greater detail in input data. A rel- 
atively recent, alternative approach is to capture the 
different attributes and behavior of each individual or- 
ganism in question, keep track of individual variation 
throughout a simulation, and eventually aggregate the 
resulting patterns at a higher level of organization for 
analysis and interpretation. The individual-based 
modeling approach supports the use of fairly simple 
rule sets while creating aggregated results that reflect 
higher levels of complexity (Huston et al. 1988). The 
challenge of keeping track of individual attributes of 
many organisms in a single simulation is no longer a 
limitation, since greater computational power has be- 
come readily available. The development of hundreds 
of individual-based models (Grimm 1999) supports 
the assertion that the individual organism is the logical 
unit for modeling certain ecological questions (Judson 
1994). As this approach gains acceptance, new simulation
environments provide the tools needed to develop
individual-based ecological models. Swarm
(Minar et al. 1996), Echo (Jones and Forrest 1993,
http://www.santafe.edu/projects/echo/echo.html),
Gecko, a simulator developed within Swarm (Booth
1999), the Model of Animal Behavior (MOAB;
http://www.usgs.gov/tech-transfer/factsheets/FS-056-97.html),
XRaptor (Bruns et al. 1999), and even the educational
tool, Ecobeaker (http://www.ecobeaker.com/index.html),
all provide development environments
for individual-based ecological models. Topping et al.
(1998) have presented a new “biological programming
language” to facilitate programming of individual-based
ecological models.

This chapter will compare the development and 
content of two models of cowbird parasitism. First, 
we list the data sources available throughout both 
modeling efforts. Then, we briefly sketch the approach 
and content of FHASM, our statistical landscape 
model, focusing on the portion of the model that ad- 
dressed cowbird parasitism. Then, we detail the ap- 
proach and content of the ICBM, our individual-based 
cowbird behavior model. Last, we compare the ability 
of the two models to predict brown-headed cowbird 
occurrences and parasitism, and briefly discuss the im- 
portance of scale in studying and modeling brown- 
headed cowbird behavior. 


Data Sources 


During model development, we had access to the re- 
sults of years of field research and management efforts 
on Fort Hood. Nest-specific parasitism and reproduc- 
tive success measures were available from 1994 to 
1996, trapping results were available for 1993-1996, 
trap locations were available for 1995 and 1996, and 
an avian community study documented the distribu- 
tion and abundance of potential cowbird hosts within 
warbler breeding habitat in 1994. In addition, sum- 
marized data on nesting success and trapping efforts 
were available for all years since 1987. A 1987 vegeta- 
tion map was available (Trame et al. 1997), but confi- 
dence in the map was very low and we did not feel 
comfortable using it to develop predictive relation- 
ships between vegetation and cowbird or host param- 
eters. An extensive land-cover trends analysis (LCTA; 
Diersing et al. 1992) database was provided for 1989 
through 1995 and contained data from randomly 
placed transects, including vegetation, vertebrates, and 
land uses across the entire installation. A GIS database
was also available from various efforts on the installation
since the mid-1980s. When we initiated the individual-based
approach, it became obvious that we
would need to simulate the behavior of cattle herds,
and we sought additional data to support that requirement.
We were provided with the locations of supplemental
feeding stations and salting areas used in cattle
herd management on Fort Hood, and we obtained
1996 Landsat imagery of the installation. Two vegetation
categories, grassland and non-grassland, were determined
from the imagery using neural network classification.
The results of two neural network
calculations were combined to create a 50-meter-resolution
vegetation map. Only cells classified as
grassland in both calculations were accepted as grassland
cells. The neural network was trained with 152
LCTA points from 1995. Settings for the two calculations
were percent trained: 0.75, 0.75; learn: 0.20,
0.60; momentum: 0.90, 0.50; epochs: 100, 100; and
iterations: 100, 1,000. Errors for the two calculations
were 0.0979 and 0.0094.
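The rule for combining the two classifications amounts to a cell-by-cell logical AND. The sketch below uses small hypothetical grids of booleans (True for grassland) and a hypothetical function name; it illustrates the rule only and is not the processing chain used for the Landsat imagery.

```python
def combine_classifications(run_a, run_b):
    """Accept a cell as grassland only if both runs classify it so.

    run_a and run_b are equally sized grids (lists of rows) of
    booleans, where True means the cell was classified as grassland.
    """
    if len(run_a) != len(run_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(run_a, run_b)
    ):
        raise ValueError("the two classifications must have the same shape")
    return [
        [a and b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(run_a, run_b)
    ]


# Hypothetical 2 x 3 outputs from the two neural network calculations.
first_run = [[True, True, False], [False, True, True]]
second_run = [[True, False, False], [True, True, False]]
grassland = combine_classifications(first_run, second_run)
```

Requiring agreement between the two runs is conservative: it can only reduce the number of grassland cells relative to either classification alone.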


The Statistical Approach 


Our initial modeling effort produced the Fort Hood 
Avian Simulation Model (FHASM), a dynamic, spa- 
tially explicit model of ecosystem processes and the 
population dynamics of the black-capped vireo and 
the golden-cheeked warbler (Trame et al. 1997). Our 
first effort to simulate cowbird parasitism was embed- 
ded within the development of this complex model. 
The model simulated changes in vegetation and in 
golden-cheeked warbler and black-capped vireo popu- 
lations across the installation landscape over 100-year 
periods. We designated management policies, or al- 
tered a variety of input variables; FHASM generated 
maps of vegetation, fires, management activities, and 
bird populations for each year of the simulation. The 
relative results of different management scenarios
were compared, something that could be neither assessed
easily nor accomplished quickly through field
studies.

FHASM was created using the Spatial Modeling
Environment (SME) software, developed at the University
of Maryland (http://kabir.cbl.umces.edu/SME3/index.html),
GRASS GIS software (http://www.baylor.edu/~grass/),
and STELLA modeling software
(High Performance Systems 1994). The general model
of FHASM was constructed using the graphical, user-friendly
STELLA program on a desktop computer.
The propagation of the general FHASM model across
a grid landscape was accomplished by the SME software,
running on a single Unix workstation.

The landscape of Fort Hood was divided into 
21,540 square raster cells, each representing a 4- 
hectare area (200x200 meters). This cell size provided 
a reasonable representation of the average breeding 
territory for either endangered species on Fort Hood 
(Weinberg et al. 1996; Jette et al. 1998). FHASM pro- 
ceeded in a three-month timestep, allowing the use of 
summary statistics that described recruitment by 
breeding birds over an entire breeding season (quarter 
2). FHASM was composed of five submodels— 
habitat, avian, management, and two submodels that 
managed spatial data. The habitat submodel simu- 
lated successional changes among fifteen mapped 
plant communities. Plant community type then was 
used by the avian submodel, which determined the 
quality of the vegetation as breeding habitat for vireos 
and warblers. Territory choices were made based on 
habitat characteristics and the history of occupation 
of a site. The management submodel simulated the ac- 
tivities of endangered species managers to promote 
and/ or protect endangered species habitat (affecting 
the habitat submodel), and to reduce the risk of cow- 
bird parasitism (affecting the reproductive parameters 
in the avian submodel). The avian submodel calcu- 
lated reproductive success from habitat quality and 
the probability of parasitism by cowbirds. The map 
input submodel informed SME of the proper GRASS 
input maps from which to attain values for mapped 
variables. Finally, the simulation submodel stored spa- 
tial variables (in the form of GRASS maps) from each 
time step and made them available to other sectors of 
the model, as needed. 

Figure 49.1. The empirical relationship between total numbers of female brown-headed cowbirds (Molothrus ater) trapped and shot on Fort Hood and installation-wide nest-specific probabilities of parasitism. [Plot title: Probability of Parasitism - FHASM Approach. Axes: probability of parasitism (%), total cowbirds trapped, total cowbirds shot.]

We modeled cowbird parasitism in FHASM based
on a strong empirical relationship between cowbird
control efforts and installation-wide parasitism rates.
As cowbird control efforts (days trapping or days
shooting), and numbers of females killed, increased
from 1988 through 1995, installation-wide parasitism
rates dropped dramatically from 90.91 percent in
1987 to a low of 12.59 percent in 1994 and 15.17
percent in 1995 (Hayden and Tazik 1991; Bolsinger
and Hayden 1992, 1994; Tazik and Cornelius 1993;
Weinberg et al. 1995, 1996; Fig. 49.1). The same data
sources revealed a coarse spatial pattern based on ac- 
cessibility to different areas on the installation. Due to 
the nature of the military mission, there were areas on
post that were closed to regular civilian access,
and thus no consistent cowbird control was possible.
For these inaccessible areas, the number of parasitized
vireo nests was strongly related (adjusted R² =
0.94839) to the number of females captured by trapping
(F = 129.6201, df = 1, P < 0.0000). In accessible
areas, percentage of parasitized vireo nests was
strongly related (adjusted R² = 0.980507) to the number
of females killed by shooting (F = 59.5913, df = 1,
P < 0.0015), the number of females captured by trapping
(F = 166.2290, df = 1, P < 0.0002), and their interaction
(F = 16.8250, df = 1, P < 0.0148). We
searched for similar relationships at a finer spatial
scale, across the five historically recognized subregions 
of the 87,000-hectare installation, without success. Al- 
though parasitism was influenced by total number of 
females captured, the relationship did not show a spa- 
tial effect. This suggested that differences in parasitism 
rates were due to total trapping efforts across the en- 
tire installation. 

Based on these analyses, the risk of cowbird para- 
sitism was modeled as a function of control effort and 
the efficiency (success rate) of the control effort (Table 
49.1; Fig. 49.2). Effort was defined as the number of 
trap days or shooting excursions for the second quarter
of each simulation year (the breeding season). Efficiency
was defined as the number of females trapped
per trap day or the number of females shot per shoot- 
ing excursion. Since there were differences among re- 
gions in the efficiency of traps, trapping effort and ef- 


TABLE 49.1.
Input data sources used for the brown-headed cowbird (Molothrus ater) control section of the Fort Hood Avian Simulation Model (FHASM).

Variable name       Content                                                 Data source                  Value used(a)

M TRAPDAYS LF       Sum of number of traps open by number of days open      1995 data, provided by       0
                    within the region of Fort Hood known as the             Mr. Gil Eckrich in 1996
                    "live fire area"

M TRAPDAYS EARA     Sum of number of traps open by number of days open      1995 data, provided by       1890
                    within the region of Fort Hood known as the             Mr. Gil Eckrich in 1996
                    "eastern ranges"

M TRAPDAYS WERA     Sum of number of traps open by number of days open      1995 data, provided by       2074
                    within the region of Fort Hood known as the             Mr. Gil Eckrich in 1996
                    "western ranges"

M TRAPDAYS WEFH     Sum of number of traps open by number of days open      1995 data, provided by       0
                    within the region of Fort Hood known as                 Mr. Gil Eckrich in 1996
                    "West Fort Hood"

M TRAPDAYS CANT     Sum of number of traps open by number of days open      1995 data, provided by       153
                    within the region of Fort Hood known as the             Mr. Gil Eckrich in 1996
                    "cantonment"

M TRAP EFFIC LF     Mean number of females trapped per trapday within       1995 data, provided by       0.0664
                    the region of Fort Hood known as the "live fire area"   Mr. Gil Eckrich in 1996

M TRAP EFFIC EARA   Mean number of females trapped per trapday within       1995 data, provided by       0.4894
                    the region of Fort Hood known as the "eastern ranges"   Mr. Gil Eckrich in 1996

M TRAP EFFIC WERA   Mean number of females trapped per trapday within       1995 data, provided by       0.8120
                    the region of Fort Hood known as the "western ranges"   Mr. Gil Eckrich in 1996

M TRAP EFFIC WEFH   Mean number of females trapped per trapday within       1995 data, provided by       0.0614
                    the region of Fort Hood known as "West Fort Hood"       Mr. Gil Eckrich in 1996

M TRAP EFFIC CANT   Mean number of females trapped per trapday within       1995 data, provided by       1.3660
                    the region of Fort Hood known as the "cantonment"       Mr. Gil Eckrich in 1996

M SHOOTDAYS         Sum of number of days in which a cowbird technician     Estimated by Mr. Gil Eckrich 27
                    patrols endangered species breeding habitat and         to represent shooting
                    shoots individual female cowbirds                       effort in 1995

M SHOOT EFFIC       Mean number of female cowbirds killed per               1989 data, provided by       1.5682
                    shooting day                                            Mr. Gil Eckrich in 1996

(a) Can be altered by the user.


ficiency were input at the regional scale, while effort 
and efficiency for shooting was modeled at the scale of 
the entire installation. The products of effort and effi- 
ciency across the installation determined a single rate 
of parasitism for the entire landscape. 
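
As a rough illustration of the effort-by-efficiency products described above, the following Python sketch multiplies the Table 49.1 values for each region and sums them. The function names and dictionary layout are hypothetical, not taken from the FHASM STELLA model. The regression step that converted numbers of females removed into an installation-wide parasitism rate is not reproduced, because its coefficients are not reported in this chapter.

```python
# Default inputs copied from Table 49.1 (all can be altered by the user).
TRAP_DAYS = {"LF": 0, "EARA": 1890, "WERA": 2074, "WEFH": 0, "CANT": 153}
TRAP_EFFIC = {"LF": 0.0664, "EARA": 0.4894, "WERA": 0.8120,
              "WEFH": 0.0614, "CANT": 1.3660}
SHOOT_DAYS = 27
SHOOT_EFFIC = 1.5682


def females_trapped(trap_days, trap_effic):
    """Sum of regional effort x efficiency products (females trapped)."""
    return sum(trap_days[region] * trap_effic[region] for region in trap_days)


def females_shot(shoot_days, shoot_effic):
    """Installation-wide effort x efficiency product (females shot)."""
    return shoot_days * shoot_effic


if __name__ == "__main__":
    print(round(females_trapped(TRAP_DAYS, TRAP_EFFIC), 1))
    print(round(females_shot(SHOOT_DAYS, SHOOT_EFFIC), 1))
```

With the Table 49.1 defaults, the trapping products sum to roughly 2,818 females and the shooting product to roughly 42 females.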

As a result, the trapping of cowbirds affected simu- 
lated parasitism on an installation-wide scale (i.e., the 
effects of trapping were experienced by nests uni- 
formly across the installation, rather than in the local 
area surrounding a trap). Although this uniformity 
was not believed to be accurate, it was the only result 


successfully developed using statistical approaches (see 
Discussion for explanation of other analyses we 
attempted). 


The Rule-Based, Individual Approach 


In response to the limited success of modeling cowbird 
parasitism in FHASM, we developed a second model 
to simulate cowbird behavior and trapping efforts on 
Fort Hood. The individual cowbird behavior model 
(ICBM) was a two-stage individual-based model that


Figure 49.2. STELLA diagram of the Cowbird Control subsection
of FHASM. Explanations of input variables are found in
Table 49.1. [Diagram labels: COWBIRD CONTROL; G BCV STUDY
SITES; M PCT MONITORED; QUARTER; M FEM CB TRAP; M FEM CB
SHOOT; M PCT PARA; G TRNAREA TYPE; and the M TRAPDAYS,
M TRAP EFFIC, M SHOOTDAYS, and M SHOOT EFFIC inputs listed
in Table 49.1.]


simulated female cowbird movement in response to 
breeding habitat quality, the locations of feeding cat- 
tle, and movement decision rules (Table 49.2). The 
first stage of the model produced maps of cumulative 
species occurrences through time for both individual 
female cowbirds and for cattle herds. If desired, out- 
put could be analyzed within the second stage of the 
ICBM to ascertain optimal trapping strategies for en- 
dangered species management (see Harper et al. 2001 
for a discussion of the trapping component of the 
model and application of the two-stage model). Oth- 
erwise, output from the first stage could be used di- 
rectly for applications such as risk-analysis of endan- 
gered species breeding habitat. This chapter focuses 
on the technical development of the first stage of the 
ICBM. 

The ICBM was developed using GRASS GIS
(http://www.baylor.edu/~grass/) and Swarm (Minar et
al. 1996), an individual (agent-based) simulation
environment available from the Santa Fe Institute
(http://www.santafe.edu/projects/swarm/swarmdoc/swarmdoc.html).
Westervelt (2001) has described the
integration of the two modeling tools for development
of the ICBM. Swarm employed an object-oriented
approach to simulate independent entities (agents)
interacting via discrete events controlled by an activity
schedule. The ICBM was composed of feeding area
agents (the equivalent of cells on a grid landscape),
agents representing locations known to influence
cattle behavior (such as persistent water sources and
supplemental cattle feeding stations), and agents
representing the organisms of interest (i.e., cattle herds
and female cowbirds). GIS was required to assess
spatial features of vegetation, such as distance to edge,
and to prepare spatially explicit data for input to the
simulation. The dynamic model simulated movement
behavior of both female cowbirds and herds of
free-ranging cattle on Fort Hood. Telemetry studies of
female cowbirds on Fort Hood during the breeding
seasons of 1995 and 1996 revealed that over 90 percent
of the afternoon sightings of feeding cowbirds were of
birds in the presence of at least thirty cattle (Cook et
al. 1997). Early experiences with cowbird control
demonstrated that trapping was most successful when
efforts targeted feeding cowbirds in the afternoon
hours (Hayden et al. 2000). As a result, simulated
cattle herd movement was an important component of
the ICBM. The landscape of Fort Hood was divided
into a grid with a resolution of 750 x 750 meters,
corresponding to a cell area of 56.25 hectares. This scale
was a reasonable match to the median (50 hectares)
and mean (65 hectares) size of female cowbird breeding
territories from 1995 telemetry studies on Fort
Hood (Cook et al. 1995), although mean breeding
territory size in 1996 was 99 hectares (Cook et al.
1997). The UTM (Universal Transverse Mercator)
coordinate for each agent within the Swarm simulation
was recorded at each timestep and was "known" to
the other agents.


TABLE 49.2.
Input data sources used for the first stage of the individual cowbird behavior model (ICBM).

Data source                  Resolution   Year      Other

Landsat TM imagery           30 meter     1996      EOSAT; seven bands
Corral locations             Point data   1998      Trame et al. 1999
Warbler and vireo breeding   50 meter     1991(a)   Probably hand digitized from
  habitat raster map                                1:50,000-scale map
Number of cattle herds       N/A          1996      Equal to the legal permit for number
                                                    of Animal Units
Water map                    50 meter     1990(a)   Originally from a 1:75,000-scale map

(a) Raster map completed.

The dynamic model ran with a daily timestep for 
ninety-day simulations, corresponding to a single 
breeding season of the endangered species on the in- 
stallation. Output reflected cumulative movement de- 
cisions over this length of time. We assumed that veg- 
etation characteristics of habitat did not change 
significantly during a single breeding season and that 
female cowbirds maintained a single breeding terri- 
tory, equivalent to one grid cell in the model, through- 
out a single breeding season. Thus, our primary goal 
was to simulate the daily movements of cowbirds as 
they traveled from their breeding territories to new 
feeding sites each day. 


Modeling Cowbird Locations 
and Parasitism 


Breeding-habitat quality for female cowbirds, and 
feeding habitat for both cowbirds and cattle, were 
evaluated using GRASS. Using a neural network, we 
classified 1996 Landsat TM imagery into grassland 
and non-grassland categories at a resolution of 50 me- 
ters. We then calculated the quantity and continuity of 
grassland within grid cells at the resolution of the 
ICBM, 750 meters (details can be found in Trame et 
al. 1998). 
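
The aggregation from 50-meter classified pixels to 750-meter model cells can be sketched as follows. This is a minimal illustration in Python rather than the GRASS commands actually used, which are not given here. It computes only the quantity (proportion) of grassland in each 15 x 15 block of pixels (750 / 50 = 15). The continuity measure is documented in Trame et al. (1998) and is not reproduced. The function name and the nested-list raster layout are assumptions.

```python
def grassland_fraction(fine_raster, block=15):
    """Proportion of grassland pixels in each block x block window.

    fine_raster: list of rows, each a list of 0 (non-grassland) or
    1 (grassland) at 50-meter resolution. Rows and columns that do not
    fill a complete block are ignored (an assumption of this sketch).
    """
    n_rows = len(fine_raster) // block
    n_cols = len(fine_raster[0]) // block
    coarse = []
    for r in range(n_rows):
        coarse_row = []
        for c in range(n_cols):
            grass = sum(
                fine_raster[r * block + i][c * block + j]
                for i in range(block)
                for j in range(block)
            )
            coarse_row.append(grass / (block * block))
        coarse.append(coarse_row)
    return coarse


if __name__ == "__main__":
    # One row of two model cells: the left is all grassland, the right none.
    demo = [[1 if col < 15 else 0 for col in range(30)] for _ in range(15)]
    print(grassland_fraction(demo))
```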


Breeding Habitat 


Using GRASS, we determined the amount of non- 
grassland habitat within delineated endangered species 
areas on Fort Hood, and we found that endangered 
species habitat was characterized by at least 30 per- 
cent non-grassland cover. Since we intended the model 
to focus on parasitism of the black-capped vireo and 
golden-cheeked warbler in particular, we initialized 
the ICBM with cowbirds in grid cells of at least 30- 
percent non-grassland vegetation. 

Decreasing mean distances to grassland edge within 


each grid cell contributed to higher breeding habitat 
quality. Three levels of breeding habitat quality were 
recognized. Grid cells containing a mean distance be- 
tween grassland and non-grassland areas of less than 
or equal to 100 meters were considered highest quality 
based on the results of Brittingham and Temple 
(1983). Grid cells containing a mean distance between 
grassland and non-grassland areas greater than 300 
meters were considered lowest quality (again, based 
on Brittingham and Temple 1983), while intermediate 
mean distance values were considered of intermediate 
quality. Using GRASS, female cowbirds were allocated 
through a weighted random algorithm. The three 
quality levels were weighted in the following way: 
low-quality cells = 1.0, intermediate-quality cells =
2.3, and high-quality cells = 3.6. Weights were
based on Brittingham and Temple’s (1983) observa- 
tion that the proportion of nests parasitized in edge 
habitats is 3.6 times higher than the proportion of 
nests parasitized in nonedge habitats (65 percent ver- 
sus 18 percent), while the intermediate weight of 2.3 
was the mean of the extremes. 
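
The classification thresholds and weights above lend themselves to a short sketch. The Python below is illustrative only: the allocation in the ICBM was done in GRASS, and the details of its weighted random algorithm are not given in this chapter. We assume here that cells are drawn without replacement, one female per cell, with selection probability proportional to the quality weight. All function names are hypothetical.

```python
import random

# Weights from the text: 65 / 18 is approximately 3.6, and 2.3 is the
# mean of 1.0 and 3.6.
WEIGHTS = {"low": 1.0, "intermediate": 2.3, "high": 3.6}


def quality_class(mean_edge_distance_m):
    """Classify a grid cell by mean grassland/non-grassland edge distance."""
    if mean_edge_distance_m <= 100:
        return "high"
    if mean_edge_distance_m > 300:
        return "low"
    return "intermediate"


def allocate_females(cells, n_females, rng):
    """Pick cells for females by weighted sampling without replacement.

    cells: dict mapping a cell id to its mean edge distance in meters.
    rng: a random.Random instance, so that runs can be reproduced.
    """
    remaining = dict(cells)
    chosen = []
    for _ in range(min(n_females, len(remaining))):
        ids = list(remaining)
        weights = [WEIGHTS[quality_class(remaining[i])] for i in ids]
        pick = rng.choices(ids, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # one female per grid cell
    return chosen


if __name__ == "__main__":
    demo_cells = {"a": 50, "b": 200, "c": 400, "d": 90}
    print(allocate_females(demo_cells, 2, random.Random(1)))
```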

The actual number of breeding cowbirds on Fort 
Hood has not been determined. The ICBM was de- 
signed to be initialized by placing a single female cow- 
bird into any grid cell with potential breeding habitat 
characteristics. The proportion of breeding habitat 
utilized by cowbirds could vary from 1 to 100 percent 
breeding habitat “saturation.” For the simulations il- 
lustrated here, breeding capacity was equal to 50 per- 
cent, for a total of 624 females in each simulation. 


Cattle-grazing Habitat 


Female cowbirds on Fort Hood were documented on 
their breeding territories during the morning hours 
and feeding or resting in the presence of domestic cat- 
tle herds during the afternoon (Cook et al. 1998). To 
model cowbird movement between breeding and feed- 
ing habitats, it was necessary to determine the location 
of cattle herds (thirty or more individuals) for the ma- 
jority of time in the afternoon period (grazing, resting, 
and drinking combined). 

Grazing-habitat quality was determined using 
GRASS. The amount and continuity of grasslands, as 
well as the distance to water and the distance to 
supplemental feed or salt equally contributed to the
grazing-habitat quality of each grid cell. Habitat quality
was represented with a value from 0 to 1 for each
of these factors, and their product yielded the measure 
of grazing habitat quality. 

Grazing quality increased as proportion of grass- 
land increased to 50 percent, at which point quality 
reached a maximal plateau. Greater habitat-quality 
values were assigned to areas of contiguous grassland 
compared with areas containing numerous disjunct 
patches of grassland (details on the mathematical cal- 
culations for habitat quality are found in Trame et al. 
1998). 

Research on range management has documented a 
solid relationship between abiotic factors, including 
the location of water, and cattle movement over large 
areas (Senft et al. 1987; Coughenour 1991). Valentine 
(1947) recommended that stocking rates be adjusted 
for distance to water, since vegetation far from water 
was underutilized compared to vegetation near water. 
The influence of distance to water on cattle grazing 
also was found to be highly significant by numerous 
other researchers (Cook 1966; Senft et al. 1983; 
Stafford Smith 1988). GRASS layers showing sources 
of water expected to be available during the avian 
breeding season on Fort Hood were available through 
installation GIS databases. Simulated cattle remained 
close to water sources, since grazing quality dropped 
as much as 50 percent at distances of only 1,500 me- 
ters (Trame et al. 1998). 
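
A minimal sketch of the multiplicative grazing-quality index follows. Only three points come from the text: each factor ranges from 0 to 1, the grassland-amount factor plateaus at 50 percent grassland, and the water factor falls by as much as 50 percent at 1,500 meters. The linear shapes of the two curves are assumptions made for illustration. The actual functions are in Trame et al. (1998). The continuity and supplemental-feed factors are simply passed in as precomputed values, and all names are hypothetical.

```python
def grass_amount_factor(grass_fraction):
    """Rises (linearly, by assumption) to 1.0 at 50 percent grassland."""
    return min(max(grass_fraction, 0.0) / 0.5, 1.0)


def water_factor(distance_to_water_m):
    """Falls (linearly, by assumption) to 0.5 at 1,500 m, floored at 0."""
    return max(0.0, 1.0 - 0.5 * distance_to_water_m / 1500.0)


def grazing_quality(grass_fraction, continuity_factor,
                    distance_to_water_m, feed_factor):
    """Product of four equally weighted factors, each between 0 and 1."""
    return (grass_amount_factor(grass_fraction)
            * continuity_factor
            * water_factor(distance_to_water_m)
            * feed_factor)


if __name__ == "__main__":
    # A cell that is 40 percent grassland, fairly contiguous, 750 m from
    # water, and close to a supplemental feeding station.
    print(grazing_quality(0.40, 0.8, 750.0, 1.0))
```

Because the factors are multiplied, a value of zero for any single factor makes the cell unusable regardless of the others, which is the practical difference between this form and an additive index.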

No quantitative information was available describ- 
ing cattle foraging response to locations of supplemen- 
tal feeding, although it has been shown to be signifi- 
cant (McDougald et al. 1989). Personnel at Fort Hood 
believed a majority of cattle remained within 1.5 to 2 
kilometers of supplemental feeding stations most of 
the time (T. L. Cook, Director of the Fort Hood Pro- 
ject of the Texas Nature Conservancy, Fort Hood, 
Texas; A-M Shapiro and S. J. Harper personal com- 
munication 1996; J. D. Cornelius, Branch Chief, En- 
dangered Species Branch, Fort Hood, Texas). How- 
ever, field work in 1998 frequently documented cattle 
herds in locations greater than 2 kilometers from feed- 
ing stations, and the model was altered to better 
match the new data (Trame et al. 1999). More- 
accurate locations of supplemental feeding stations 
and salt blocks also were recorded during 1998 and 


entered into a GRASS layer to improve model 
accuracy. 


Cattle Movement 


Simulated cattle movement was largely based on day- 
to-day movement patterns recorded by Bailey et al. 
(1990) in a 248-hectare Texas pasture of fairly homogeneous
forage. Their experimental pasture was initially
divided into sixty-three unfenced units. However,
this was too fine-scale to capture actual cattle
movement patterns, and the final analysis divided the 
pasture into five large sections to calculate transition 
probabilities among different areas on a daily time 
scale. The average area of the five sections was 49.6 
hectares, approximated in the ICBM with grid-cell 
sizes of 56.25 hectares. Bailey et al. (1990) concluded 
that cattle display a "win-switch" strategy when foraging.
Rather than stay in a productive area until the
value drops, as predicted by optimal foraging theory
(Charnov 1976), they switch grazing areas before forage
quality drops. In addition, cattle may utilize spatial
memory (Bailey 1995; Bailey et al. 1996) to avoid
recently grazed areas. Behavioral research has shown
that cattle recall locations of food depletion for at
least eight days (Bailey et al. 1996). Such mechanisms
may be important in homogeneous environments;
studies in heterogeneous landscapes may show different
patterns arising from alternative mechanisms (Bailey
et al. 1990). In the ICBM, we assumed that forage
quality was relatively homogeneous and cattle movement
on Fort Hood was comparable to that recorded
by Bailey et al. (1990).

Cattle were modeled as small herds, with each
agent representing thirty individuals. Supplemental
feeding and salting locations served as "home base" to
the herds of cattle and will be referred to by that
name. Grid cells with GIS-generated habitat characteristics
(described earlier) were referred to as "feeding
areas" in the Swarm model, since that is the primary
function for which they were evaluated by both the
cowbirds and the cattle herds (but note that many of
these "feeding areas" actually served as breeding territories
for cowbirds). To approximate the current grazing
capacity on Fort Hood, four cow herd agents were
placed into each home base upon initialization of
Swarm. Each cow herd agent then identified the feeding
areas within a fixed distance (3.75 kilometers)
from its home base and created cow-memory objects
for each. During simulations, cow herds only considered
moving to these nearby feeding areas to minimize
computational overhead.

During simulation, each cow herd agent chose a 
feeding area in which to feed at the beginning of each 
timestep. The choice was based on an attractiveness 
value composed of three factors: (1) its grazing qual- 
ity, (2) time since first occupied by the herd, and (3) 
distance from the herd's current location. 

When a cow herd first visited a feeding area, both 
the cow herd and the feeding area agents recorded the 
move. The time since first occupied was then factored 
in during subsequent movement decisions. The value 
of the visited feeding area decreased daily through the 
fifth day following initial visitation. By that day, the 
probability of re-visiting dropped to zero. The value of 
the feeding area then increased over the following 
eight days. No reduction in value was created due to 
recent occupation by a different herd, since avoidance 
was thought to be based on memory of the cattle's 
own occupation, and not environmental cues (Bailey 
et al. 1996). The distance from a herd's current loca- 
tion was factored into subsequent moves as well. At- 
tractiveness was not reduced for the feeding area cur- 
rently occupied, nor for immediately adjacent feeding 
areas, but the value declined in a linear fashion for 
feeding areas at distances of two, three, and four cells. 
Beyond distances of four cells, the attractiveness of 
feeding areas dropped to zero. Each of these three at- 
tractiveness factors (grazing quality, distance from the 
previous herd location, and time since first occupied) 
was represented by values ranging from 0 to 1, which 
were multiplied together with equal weighting to cal- 
culate the overall attractiveness value. Once the graz- 
ing quality of each feeding area was updated in each 
new time step, the movement rule instructed each cow 
herd to move to the feeding area with the highest graz- 
ing quality. Percentage of cattle herds moving zero, 
one, two, and three or more cells distance is output 
for each time step during simulation. 
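
The three attractiveness factors can be sketched as below. The breakpoints (value reaching zero on the fifth day, recovery over the following eight days, no distance penalty within one cell, zero beyond four cells) come from the text. The linear shape of the time curve and the 0.25 step per cell are assumptions made for illustration. The sketch also assumes that the herd moves to the candidate with the highest combined attractiveness value. Function names are hypothetical and do not correspond to the Swarm source code.

```python
def time_factor(days_since_first_visit):
    """Memory-based avoidance; None means the herd never visited the area."""
    if days_since_first_visit is None:
        return 1.0
    d = days_since_first_visit
    if d <= 5:
        return 1.0 - d / 5.0      # declines to zero by the fifth day
    if d < 13:
        return (d - 5) / 8.0      # recovers over the following eight days
    return 1.0


def distance_factor(cells_away):
    """No penalty at 0 or 1 cells; linear decline at 2-4; zero beyond 4."""
    if cells_away <= 1:
        return 1.0
    if cells_away > 4:
        return 0.0
    return 1.0 - 0.25 * (cells_away - 1)


def attractiveness(grazing_quality, days_since_first_visit, cells_away):
    """Equal-weight product of the three factors, each between 0 and 1."""
    return (grazing_quality
            * time_factor(days_since_first_visit)
            * distance_factor(cells_away))


def choose_feeding_area(candidates):
    """candidates: (area_id, grazing_quality, days_since_first_visit,
    cells_away) tuples. Returns the id with the highest attractiveness;
    ties go to the first candidate listed."""
    return max(candidates, key=lambda c: attractiveness(c[1], c[2], c[3]))[0]


if __name__ == "__main__":
    demo = [("here", 0.9, 3, 0), ("near", 0.6, None, 1), ("far", 1.0, None, 4)]
    print(choose_feeding_area(demo))
```

In the demonstration, the herd leaves a high-quality cell it first occupied three days earlier for an unvisited neighbor of lower grazing quality, which is the "win-switch" behavior the rules were meant to reproduce.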


Cowbird Movement 


Female cowbird agents were placed into breeding ter- 
ritories upon Swarm initialization, as described above. 


To follow movements of those cowbirds most likely to 
parasitize endangered species, we labeled each cow- 
bird agent as breeding in black-capped vireo protec- 
tion areas, golden-cheeked warbler protection areas, 
or locations not within protection areas. These labels 
were available in model output of cowbird behavior 
patterns. 

Feeding cowbirds at Fort Hood nearly always as- 
sociate with small herds of cattle, so movement rules 
in ICBM rely on cowbird agents "perceiving" the locations
of cattle herd agents, within limits. We know
of no field research describing how cowbirds locate 
feeding sites with cattle, areas that can be more than 
18 kilometers from their breeding territories (Cook et 
al. 1998). We developed four behavioral algorithms, 
or movement rules, to simulate the daily search for a 
suitable feeding area after leaving the breeding terri- 
tory, and compared the daily movement distances re- 
sulting from each rule with daily movement data 
from telemetry on Fort Hood. The best-performing 
rule (for comparisons see Harper et al. 2001) directed 
each cowbird agent to move, in reverse chronological 
sequence, directly to each of the three most recently 
utilized feeding areas until a cattle herd was located. 
If no cattle herd was found at any of these previously 
successful locations, the cowbird would move ran- 
domly to the nearest feeding area, then to the next 
nearest feeding area, and so forth, until a cattle herd 
was located. Along the search path, the cowbird had 
the ability to assess feeding areas en route to the three 
previously successful locations. We defined the per- 
ception distance (i.e., distance away from the 
straight-line travel path) within which cowbirds could 
locate cattle herds to be 1,000 meters while in flight, 
a conservative distance considering the open land- 
scape of Fort Hood grasslands. The decision to feed 
in a feeding area was registered by the feeding area 
agent and the summed counts were used to produce 
output of cumulative occurrences in each feeding area 
on the landscape. 
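
The best-performing movement rule can be sketched in Python as follows. The 1,000-meter perception distance and the order of the search (three most recent feeding areas first, then nearest-first) come from the text. Three details are assumptions of this sketch: when several herds are perceived along a flight leg, the bird chooses the one nearest its position at the start of that leg; during the nearest-first search a herd is detected only on arrival at a feeding area; and "nearest" is measured from the bird's current position. Names and data layout are hypothetical.

```python
import math

PERCEPTION_M = 1000.0  # perception distance from the flight path (from text)


def dist_point_to_segment(p, a, b):
    """Shortest distance from point p to segment a-b; (x, y) in meters."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg2 = dx * dx + dy * dy
    if seg2 == 0:
        return math.dist(p, a)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg2))
    return math.dist(p, (ax + t * dx, ay + t * dy))


def find_feeding_site(territory, recent_sites, feeding_areas, herd_sites):
    """Return the (x, y) of the herd a cowbird settles on, or None.

    recent_sites: previously used feeding areas, most recent first.
    feeding_areas: all candidate cell centers for the fallback search.
    herd_sites: set of cell centers currently occupied by cattle herds.
    """
    pos = territory
    # Stage 1: fly to the three most recent feeding areas in reverse
    # chronological order, noticing herds near the straight-line path.
    for target in recent_sites[:3]:
        seen = [h for h in herd_sites
                if dist_point_to_segment(h, pos, target) <= PERCEPTION_M]
        if seen:
            return min(seen, key=lambda h: math.dist(pos, h))
        pos = target
    # Stage 2: visit the nearest unvisited feeding area, then the next.
    unvisited = set(feeding_areas)
    while unvisited:
        nearest = min(unvisited, key=lambda f: math.dist(pos, f))
        if nearest in herd_sites:
            return nearest
        unvisited.remove(nearest)
        pos = nearest
    return None


if __name__ == "__main__":
    print(find_feeding_site((0, 0), [(3000, 0)], [], {(1500, 750)}))
```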

Additionally, the status of each female cowbird, 
whether trapped or not yet trapped, was recorded for 
each time step by the feeding area agent serving as the 
breeding territory, and the summed counts provided 
the total number of days that a simulated cowbird 
occupied her breeding territory. From this output,


Figure 49.3. An example of how the probability of parasitism
within a single ICBM (individual cowbird behavior model) grid
cell could be based on the length of occupation of the grid cell
by a breeding female brown-headed cowbird (Molothrus ater).
This example is not based on empirical data, but on professional
judgment. Figure was generated using spreadsheet software
Microsoft Excel. [Plot title: Probability of Parasitism -
ICBM Approach.]


spatially explicit estimates of cowbird parasitism 
probabilities can be calculated by assigning probabil- 
ity of parasitism according to how many days a female 
occupied a territory (Fig. 49.3). 


Discussion 


Altogether, three separate analytical efforts were made 
to predict cowbird occurrences and/or parasitism 
rates. Two attempts at modeling spatially explicit par- 
asitism rates failed. Due to the rich data sources we 
had available, and to our inability to develop spatially 
explicit predictions of cowbird occurrences, we feel it 
is worthwhile to explain the two analyses attempted 
and rejected during model development. First, we ex- 
amined available data on northern cardinal (Cardi- 
nalis cardinalis) densities, preferred habitat of cardi- 
nals, and nest parasitism of black-capped vireos, 
based on Barber's (1993) findings that numbers of 
nearby cardinals are strongly correlated to high para- 
sitism in the vireo. This approach failed because the 
value needed for our model was a probability for par- 
asitism, whereas the categorical nature of individual 
nest information (parasitized versus not parasitized) 
was inappropriate. We attempted to aggregate individ- 
ual nests into groups for calculations of probability 
for parasitism and then to subsequently compare 


those groups to the independent variables of interest. 
(It is important to realize that determining probability 
of parasitism was essential; we modeled the impact of 
parasitism on black-capped vireo fecundity with Pease 
and Grzybowski's [1995] model, which required prob- 
ability of parasitism as an input.) 

We also attempted to generate parasitism probabil- 
ities based on distance to traps, amount of nearby dis- 
turbed habitat, and categorical (parasitized versus not 
parasitized) nest data. For each grid cell in the model, 
the numbers of parasitized vireo nests and total nests 
in a 5-kilometer radius were used to calculate the per- 
cent parasitism in the neighborhood. The weighted 
counts were generated by a circular matrix filter, then 
summed, and the neighborhood value for percent par- 
asitism was assigned to the centroid (focal) cell. The 
same approach was used to calculate weighted counts 
of nearby disturbed cells and weighted counts of 
nearby traps. Our regression analyses revealed no sig- 
nificant relationship between numbers of nearby traps, 
amount of nearby disturbed habitat and the probabil- 
ity of parasitism assigned to focal cells. 
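
The neighborhood calculation can be illustrated with a simple circular moving window. The sketch below is in Python rather than the GIS filter we used. It assumes uniform weights inside the 5-kilometer radius (the actual filter weights are not given here) and takes the cell size as a parameter, since the grid resolution for this analysis is not stated. The 750-meter default is borrowed from the ICBM grid purely for illustration.

```python
import math


def neighborhood_percent_parasitism(parasitized, total,
                                    cell_m=750.0, radius_m=5000.0):
    """Percent of nests parasitized within radius_m of each focal cell.

    parasitized, total: equal-sized grids (lists of rows) of nest counts.
    Returns a grid of percentages, with None where the neighborhood
    contains no nests.
    """
    reach = int(radius_m // cell_m)
    # Offsets of all cells whose centers lie inside the circular window.
    offsets = [(dr, dc)
               for dr in range(-reach, reach + 1)
               for dc in range(-reach, reach + 1)
               if math.hypot(dr, dc) * cell_m <= radius_m]
    n_rows, n_cols = len(total), len(total[0])
    result = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            para_sum = nest_sum = 0
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    para_sum += parasitized[rr][cc]
                    nest_sum += total[rr][cc]
            if nest_sum > 0:
                result[r][c] = 100.0 * para_sum / nest_sum
    return result


if __name__ == "__main__":
    para = [[1, 0, 0]]
    nests = [[2, 0, 2]]
    print(neighborhood_percent_parasitism(para, nests, radius_m=750.0))
```

The same window, applied to grids of trap counts or disturbed cells, would yield the other two neighborhood variables used in the regression.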


Comparison of Inputs and Outputs 


The FHASM approach required input values for man- 
agement effort and efficiency (success rate) of simu- 
lated cowbird trapping and shooting programs. These 
values were specified for each of five subregions of the 
installation, but numbers of birds killed were added 
together to produce a single cowbird result variable 
for the entire installation. A single value for probabil- 
ity of parasitism across the whole installation was as- 
signed to each cell of the model. Probability of para- 
sitism was used, along with other factors (for details 
see Trame et al. 1997), in Pease and Grzybowski's 
(1995) mathematical model to generate breeding sea- 
son total number of offspring for each vireo breeding 
pair. FHASM did not generate maps of cowbird loca- 
tions or cumulative occurrences through time. It was 
not possible to assess the risk of parasitism in different 
regions of the installation. However, the approach 
used in FHASM incorporated the influence of cowbird 
parasitism in simulations of vireo and warbler breeding
populations. In FHASM simulations, overall parasitism
and its effects on fecundity responded to different
trapping strategies, which allowed modeling of


Figure 49.4. Example GIS layer output from the first stage of
the individual cowbird behavior model (ICBM), illustrating simu- 
lated cumulative occurrences (in days) of feeding brown- 
headed cowbirds (Molothrus ater) on Fort Hood. Darker cells 
experienced higher cumulative occurrences. Cells outside 
boundary represent cowbird feeding on nearby private lands. 
Figure created using GRASS GIS software. 


comparative scenarios by endangered species man- 
agers on Fort Hood. 

The ICBM was developed to provide spatially ex- 
plicit parasitism risks (and to simulate alternative trap 
management scenarios, an extension that is beyond the 
scope of this chapter). We specified the proportion of all 
potential cowbird breeding territories that received a 
cowbird upon model initialization. We provided spa- 
tially explicit habitat-quality levels for both cowbirds 
and cattle herds and for the sites of landscape features, 
such as water sources and supplemental feeding or salt- 
ing stations, that significantly affect cattle behavior. The 
first stage of the ICBM produced spatially explicit cu- 
mulative occurrence counts of feeding female cowbirds 
(Fig. 49.4). When the second stage of the ICBM was ap- 
plied, some of the female cowbirds were trapped and 
killed, and the ICBM generated spatially explicit terri- 
tory occupation data, showing the cumulative number 
of days each breeding territory was inhabited by a 
breeding female cowbird (Fig. 49.5). Using GRASS for 
further analysis, this output was used to generate prob- 
abilities of parasitism for each grid cell and supported 


Figure 49.5. Example output following the application of the 
second stage of the individual cowbird behavior model (ICBM), 
in which some female brown-headed cowbirds (Molothrus ater) 
have been trapped. This GIS layer illustrates the total length of 
territory occupation (in days) for each female cowbird on Fort 
Hood. Darker cells experienced longer lengths of occupation. 
Figure created using GRASS. 


risk-analysis of parasitism of endangered species. As 
with FHASM output, endangered species managers can 
specify different trapping scenarios (in much greater de- 
tail using the ICBM, see Harper et al. 2001) and com- 
pare between scenarios for effectiveness in removing fe- 
male cowbirds and in reducing parasitism rates across 
different regions of the installation. Field validation was 
conducted to test ICBM predictions of cattle and cow- 
bird occurrences, and in a separate validation study, the 
ICBM was parameterized and applied to a landscape 
that has not yet experienced any cowbird control. Re- 
sults from these studies are currently being documented 
(A.-M. Shapiro in preparation). Recent work created a 
link between the FHASM and ICBM models. In the fu- 
ture, FHASM will be able to access spatially explicit 
parasitism probabilities from the ICBM instead of using 
a single value for every cell across the landscape (Trame 
et al. 1999). 


Issues of Scale 


The ecology and management of cowbirds is well 
suited to landscape simulation since the cowbird
responds to landscape-level patterns in feeding and
breeding habitats. Because they do not protect their 
young or a nest, female cowbirds can range large 
distances in search of suitable feeding areas. Re- 
searchers have reported maximum daily movements 
between breeding territories and feeding sites as much 
as 7 to 18 kilometers (Rothstein et al. 1984; Cook et 
al. 1998, respectively). Management and control of 
cowbirds must target the entire landscape. 

Recent telemetry studies on Fort Hood have 
demonstrated that livestock pastures and bird feeders 
located on lands outside the installation boundary in- 
fluenced the behavior of female breeding cowbirds on 
Fort Hood (Cook et al. 1998); in response, the ICBM 
included off-post landscape and livestock (Trame et al. 
1999). It seems likely that cattle herds and endangered 


species nests located relatively close to places with 
abundant livestock experience more occurrences of 
cowbirds. However, the need for a female cowbird to 
return to her breeding territory may scatter individu- 
als more than expected by livestock patterns alone. 
The spatial scale at which patterns develop depends
on the landscape context of all available breeding
habitat and feeding habitat but is not currently understood.
Theoretically, output from the ICBM could
be analyzed to search for and quantify the scale at 
which cowbirds congregate in feeding areas on a land- 
scape. Such an analysis would be desirable in the fu- 
ture and should further elucidate the ability of individ- 
ual-based models to generate landscape-level insights 
that are not possible using aggregated approaches 
such as that applied in FHASM. 


CHAPTER
Forests of Canada


Pierre R. Vernier, Fiona K. A. Schmiegelow, and Steve G. Cumming 


Effective wildlife conservation in forested landscapes
managed for multiple objectives increasingly
relies on models to predict the outcome of alternative
management scenarios on the distribution
and abundance of focal species. Habitat models 
based on remotely sensed data such as forest inven- 
tories or satellite imagery are inexpensive to develop 
compared to models based on detailed vegetation 
data collected in the field (e.g., Venier and Mackey 
1997) and may be as effective in predicting abun- 
dance at the spatial scales considered here (F. K. A. 
Schmiegelow et al. unpublished analysis). As our 
goal is prediction at spatial extents commensurate 
with forest management planning, candidate inde- 
pendent variables should be derivable from available 
spatial data. We intend to use models such as
those presented here within a spatially explicit land- 
scape simulation model of wildfire and stand dynam- 
ics (Cumming et al. 1998; see also He and Mladenoff 
1999), which is initialized from digital forest inven- 
tories. Thus, our objective was to develop habitat 
models using only such data. Once developed and 
validated, these models will be used to evaluate the 
consequences of alternative management activities 
and policies over spatial extents of several thousand 
square kilometers and time horizons of at least one 
hundred years (e.g., Hansen et al. 1993, 1995).

In this study, we used bird survey and forest inventory
data from the mixedwood region of the boreal
forest in Alberta, Canada, to model abundances
of six bird species as functions of habitat characteris- 
tics measured at two spatial scales (see Maurer 1985; 
Addicott et al. 1987; Karl et al., Chapter 51). We 
refer to these scales as the local and neighborhood 
scales, which are intended to represent a habitat 
patch within a spatial context. At the local scale, 
commensurate with the territory sizes of species con- 
sidered here (F. K. A. Schmiegelow unpublished 
data), and with the resolution at which bird observa- 
tions were recorded (see Methods), we measured for- 
est characteristics such as stand height and crown 
closure in a 100-meter-radius buffer. We defined a 
neighborhood as a 400-meter-wide buffer surround- 
ing the local habitat patch, where we measured the 
abundance and configuration of different forest 
cover types and anthropogenic features. The neigh- 
borhood size was selected, in part, to be consistent 
with the extent at which other ecological phenom- 
ena, such as fire ignition and spread, are represented 
in our landscape simulator. In some recent assess- 
ments of the effects of habitat loss and fragmentation 
on forest birds, our neighborhood scale is considered 
a landscape (Edenius and Sjöberg 1997; Drolet et al.
1999). However, we considered both
our spatial extents to be consistent with a broad
interpretation of Johnson's (1980) second-order
habitat selection in which habitat composition and
configuration are characterized at multiple spatial
extents, at the level of territories. Our specific objec- 
tives here were to develop statistical habitat models 
relating observed bird abundances to local and 
neighborhood habitat characteristics measured from 
forest inventory data and then to estimate the predic- 
tive performance of the habitat models using a statis- 
tical cross-validation approach. We did not attempt 
to determine the relative (and unique) contributions 
of local and neighborhood habitat characteristics to
bird species abundance. We are exploring this issue 
further. 


Methods 


Our study area encompassed about 140 square kilo- 
meters of boreal mixedwood forest near Calling Lake 
in north-central Alberta, Canada (55°N, 113°W; Figs. 
50.1 and 50.2). Mean summer (early June through 
mid-August) precipitation in the region is about 320 
millimeters, accounting for more than 70 percent of 
the total yearly precipitation; July is generally the 
wettest month. The mean summer temperature is 
12.0°C, and the mean freeze-free period is eighty-five 
days (Strong and Leggat 1981). Trembling aspen 
(Populus tremuloides), balsam poplar (P. balsamifera), 
and white spruce (Picea glauca) are the most abundant 
upland tree species, often occurring together in old, 
mixed stands, whereas black spruce (P. mariana) char- 
acterizes wetter sites (Strong and Leggat 1981). The 
dominant understory shrubs are alder species (Alnus 
tenuifolia, A. crispa) with lesser amounts of willow 
(Salix spp.). Various fruiting shrubs (Rubus, Rosa 
spp.), sarsaparilla (Aralia nudicaulis), and other 
herbaceous plants dominate the lower strata. 


Bird Surveys 


We used bird abundance data collected by point-count 
surveys conducted between 1993 and 1998 as part of 
the Calling Lake Fragmentation Experiment and re- 
lated studies (e.g., Schmiegelow et al. 1997). A total of 
406 permanent sampling stations were located within 
sixty-five sites, which we define as contiguous areas of 
similar forest type and age. Site types included areas 
clearcut in 1993 as part of the experimental design, 
young and old deciduous forests, mature coniferous


Figure 50.1. Location of the Calling Lake Study Area in north- 
central Alberta, Canada. 


forests, and mixedwood forests. There was at least 
200 meters between each sampling station. In every 
year that a station was sampled, point counts were 
conducted five times during the breeding season at 
ten-day intervals from the third week in May through 
early July. Upon arrival at a station, observers waited 
for one minute and then recorded all birds seen and 
heard during a five-minute sampling interval, within 
50-meter and 100-meter distance classes. All individu- 
als recorded at a station during a given visit were 
mapped, and movements during the five-minute sam- 
pling period noted, ensuring that individuals were 
recorded only once (see Ralph et al. 1993). Sampling 
effort ranged from one to six years across
stations, as resources allowed additional survey points 
to be added to the main experimental design described 
by Schmiegelow et al. (1997). 

We developed models for six bird species (Table
50.1): black-capped chickadee (BCCH; Poecile atricapilla),
black-throated green warbler (BGNW; Dendroica
virens), red-breasted nuthatch (RBNU; Sitta
canadensis), white-throated sparrow (WTSP;
Zonotrichia albicollis), yellow-rumped warbler
(YRWA; D. coronata), and yellow warbler (YWAR;
D. petechia). This suite of species follows Schmiegelow
and Hannon (1999), representing a range of
observed abundances and expected responses to forest
fragmentation. For each bird species, we calculated
Figure 50.2. Distribution of forest habitat (light gray),
clearcuts (white), lakes (dark gray), nonforest habitats (medium 
gray), and bird sampling stations in the Calling Lake Study 
Area. 


the mean abundance per station per year (after 
Schmiegelow et al. 1997) and multiplied (weighted) 
this by the number of years a station was sampled 
(between one and six years). We used this aggregated 
count value as our response variable in subsequent 
statistical modeling. We did not include 1994 data 
from experimental areas where transient effects had 
been documented. This applied to ninety stations sur- 
rounded by the 1993 clearcuts, where temporary 
crowding of birds occurred (see Schmiegelow et al. 
1997 for details). 


Habitat Variables 


Habitat patterns around each bird sampling station
were quantified using 1:20,000-scale digital Alberta 
Vegetation Inventory (AVI) maps. These maps were 
produced by interpretation of 1:15,000-scale aerial 
photography, updated to 1994. The digital maps con- 
tain several data layers that are potentially useful for 
modeling wildlife habitat relationships. The forest 
cover layer links stand polygons to forest attributes 
such as species composition, crown closure, height, 
and estimated stand age. This layer also includes in- 
formation on nonforest cover types such as permanent 


clearings, lakes, and wetlands. Two other map layers 
described the location of streams, logging roads, and 
seismic cutlines. We developed a habitat classification 
system based on the overstory and understory tree 
species (or genus in the case of Populus), stand age, 
and management history (Table 50.2). The classifica- 
tion system was used to create a generalized map of 
forest and nonforest habitat classes within the study 
area. The point count stations were georeferenced and 
linked to the AVI spatial database. 

We used the original and derived map layers to 
measure habitat characteristics around each bird- 
sampling station at two spatial scales (Table 50.3, Fig. 
50.3): the local scale, which matched the size and 
shape of the circular bird-sampling stations (inner 
buffer of 100-meter radius, or 3.14 hectares), and the 
neighborhood scale, which extended from 100 to 500 
meters from the station centers (outer buffer,
75.4 hectares). The habitat characteristics we chose 
have either previously been used in the literature or 
were hypothesized correlates of species abundance 
based on the ecology of the species. 
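The two buffer areas quoted above follow directly from the buffer radii. The short sketch below reproduces them; the function names are illustrative and not part of the original analysis.

```python
import math

def circle_area_ha(radius_m: float) -> float:
    """Area of a circle in hectares (1 ha = 10,000 square meters)."""
    return math.pi * radius_m ** 2 / 10_000

def ring_area_ha(inner_m: float, outer_m: float) -> float:
    """Area of the ring between two concentric circles, in hectares."""
    return circle_area_ha(outer_m) - circle_area_ha(inner_m)

local_ha = circle_area_ha(100)            # inner (local) buffer
neighborhood_ha = ring_area_ha(100, 500)  # outer (neighborhood) buffer

print(f"local: {local_ha:.2f} ha, neighborhood: {neighborhood_ha:.1f} ha")
```

The neighborhood is treated here as a ring that excludes the inner buffer, which is the reading consistent with the 75.4-hectare figure.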

Seven variables characterized the structure and 
composition of the inner buffers (Table 50.3). A cate- 
gorical variable (having discrete, unordered values) 
specified the habitat class at the origin of the station. 
Four continuous variables quantified the size of the 
habitat patch containing the origin and the area-weighted
means of canopy height, crown closure, and
proportion of deciduous species in the canopy, for 
forested habitats intersecting the buffer. Two index 
variables coded the presence/absence of streams and 
anthropogenic edges within the buffer. 
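As a minimal sketch of the area-weighted means described above, suppose each forested polygon intersecting a buffer contributes its attribute value weighted by the area of overlap. The function name, the example polygons, and the choice to return zero when no forest intersects the buffer are assumptions made for illustration only.

```python
def area_weighted_mean(values, areas_ha):
    """Mean of polygon attribute values weighted by area of overlap.

    Returns 0.0 when no forested polygon intersects the buffer
    (an assumed convention, chosen for this sketch).
    """
    total_area = sum(areas_ha)
    if total_area == 0:
        return 0.0
    return sum(v * a for v, a in zip(values, areas_ha)) / total_area

# Hypothetical buffer: two stands, 24 m and 12 m tall,
# covering 2.0 ha and 1.0 ha of the buffer.
mean_height_m = area_weighted_mean([24.0, 12.0], [2.0, 1.0])
print(mean_height_m)
```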

Twelve variables characterized the structure and 
composition of neighborhoods, or outer buffers. Five 
of these variables measured the proportional areas of 
deciduous forest, mixedwood forest, clearcuts, mid- 
seral forest (fifteen to ninety years), and late-seral for- 
est (more than ninety years). Four variables indexed 
the presence of white spruce, black spruce, water, and 
anthropogenic habitat. We also derived three variables 
descriptive of neighborhood spatial structure. Simp- 
son’s diversity index (N_SIMP) measures the number 
of patch types and their relative abundance. The index 
represents the probability that two randomly selected 
patches belong to different patch types. The higher the 
TABLE 50.1.

Distribution and abundance of bird species at Calling Lake, Alberta, Canada, from 1993 to 1998.

                                                                 No. of     Mean         Mantel
Code   Common name (a)                Scientific name            stations   detections   signif. (b)
WTSP   White-throated sparrow         Zonotrichia albicollis     356        3.96         0.000
YRWA   Yellow-rumped warbler          Dendroica coronata         327        ?            0.000
RBNU   Red-breasted nuthatch          Sitta canadensis           266        0.61         0.000
BGNW   Black-throated green warbler   Dendroica virens           224        0.92         0.000
YWAR   Yellow warbler                 Dendroica petechia         ?          ?            0.000
BCCH   Black-capped chickadee         Poecile atricapilla        151        0.14         0.285

(a) Species are listed from most common to least common based on the number of stations in which they were
detected.

(b) Mantel signif. indicates the significance level of a randomization procedure used to test for spatial
autocorrelation in bird counts.


value, the greater the structural diversity. After 
gridding the classified map of habitat classes to 0.01 
hectare, we used FRAGSTATS (McGarigal and Marks 
1995) to calculate two contrast-weighted measures of 
edge density to characterize the heterogeneity and 
fragmentation of the neighborhoods. N_EDGEN and 
N_EDGEA measured the density in meters per hectare 
of natural and anthropogenic edges, respectively. Both 
measures relied on separate edge contrast matrices 


TABLE 50.2.

Habitat classification system used to calculate several local
and neighborhood-level habitat variables.

Class      Description
WATER      Water (lake, ice, river)
NONFOR     Nonforest and wetland
Y_DECID    > 70% deciduous and < 90 years
O_DECID    > 70% deciduous and > 90 years
W_SPRUCE   > 70% white spruce
B_SPRUCE   Leading black spruce
PINE       Leading pine
MIXED      Mixed deciduous/white spruce
CCUT       Clearcuts < 15 years
ANTHRO     Anthropogenic (wellsites, large cutlines, etc.)


that assigned weights, ranging from 0 to 1, to adjacent 
habitat classes (Table 50.4). The weights were esti- 
mated subjectively, based on our knowledge and field 
experience. To N_EDGEA, we added the density of 
logging roads and cutlines; these were linear features 
in the underlying AVI data. 
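The neighborhood structure variables can be sketched as follows. This is not the FRAGSTATS implementation: it assumes Simpson's index is computed from the proportional area of each habitat class, and that contrast-weighted edge density is the sum of edge lengths, each multiplied by the contrast weight for the adjoining class pair, divided by neighborhood area. The edge lengths and weights in the example are illustrative.

```python
def simpson_diversity(area_by_class):
    """Simpson's index: 1 minus the sum of squared class proportions."""
    total = sum(area_by_class.values())
    return 1.0 - sum((a / total) ** 2 for a in area_by_class.values())

def contrast_weighted_edge_density(edges, contrast, area_ha):
    """Contrast-weighted edge density in meters per hectare.

    edges    : list of (class_a, class_b, length_m) tuples
    contrast : dict mapping frozenset({class_a, class_b}) to a 0-1 weight
    area_ha  : area of the neighborhood in hectares
    """
    weighted_m = sum(length * contrast[frozenset((a, b))]
                     for a, b, length in edges)
    return weighted_m / area_ha

# Hypothetical neighborhood with four equally abundant habitat classes.
n_simp = simpson_diversity(
    {"O_DECID": 18.85, "MIXED": 18.85, "W_SPRUCE": 18.85, "NONFOR": 18.85}
)

# Illustrative natural-edge weights for two class pairs.
weights = {
    frozenset(("ydec", "odec")): 0.50,
    frozenset(("nonfor", "odec")): 0.75,
}
edge_segments = [("ydec", "odec", 400.0), ("nonfor", "odec", 200.0)]
n_edgen = contrast_weighted_edge_density(edge_segments, weights, 75.4)

print(round(n_simp, 2), round(n_edgen, 2))
```

Storing weights under unordered class pairs mirrors the symmetric, lower-triangular layout of an edge contrast matrix.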

Prior to model development, we checked the dis- 
tributional assumptions of our candidate predictor 
variables. N_PATCH was log transformed to im- 
prove normality. Numeric variables with an excess of 
zeroes were converted to binary variables (e.g., 
N_SB). There were no highly correlated pairs of pre- 
dictor variables (Pearson’s r > 0.75). We checked for 
nonlinear relationships between bird abundances 
and our continuous variables using scatterplots 
with lowess smoothers. We found no evidence of 
nonlinearity. 


Statistical Analysis 


We used generalized linear models (GLM; McCul- 
lagh and Nelder 1989) to model the response of bird 
species to local and neighborhood habitat character- 
istics. GLMs can represent a greater variety of rela- 
tionships between response and explanatory vari- 
ables than can linear regression models, and do not 


TABLE 50.3.

AVI-based habitat variables. Local habitat variables were measured within a 100-meter radius while neighborhood variables were
measured in a 400-meter radius beyond each local (inner) buffer.

Variable    Variable type   Range of values   Description

Local
L_CCUT,     Dummy coded     7 classes         Habitat types in which stations were located (see Table 50.2 for descriptions)
L_MIXED,
L_ODEC,
L_PINE,
L_SB,
L_SW,
L_YDEC
L_SIZE      Numeric         0.5-703.4 ha      Patch size; relies on a habitat classification system (Table 50.2)
L_DIST      Numeric         0-1238.9 m        Distance of station center to nearest anthropogenic edge (habitat classes 9 and 10)
L_CROWN     Numeric         0-85.5 %          Mean crown closure among forested polygons
L_DEC       Numeric         0-1.0             Mean deciduous proportion of forested polygons
L_HT        Numeric         0-31.0 m          Mean stand height of forested polygons
L_STREAM    Binary          0 or 1            Presence of streams or lakes

Neighborhood
N_CUT       Numeric         0-0.66            Proportion of neighborhood in a clearcut
N_MID       Numeric         0-0.99            Proportion of neighborhood in mid seral forest (15-90 years)
N_LATE      Numeric         0-1.00            Proportion of neighborhood in late seral forest (more than 90 years)
N_DEC       Numeric         0-1.00            Proportion of neighborhood in deciduous forest
N_MIXED     Numeric         0-0.77            Proportion of neighborhood in mixedwood forest
N_SB        Binary          0 or 1            Presence of black spruce forest
N_SW        Binary          0 or 1            Presence of white spruce forest
N_ANTHRO    Binary          0 or 1            Presence of anthropogenic features (well sites, clearings, gravel pits, highways, etc.)
N_WATER     Binary          0 or 1            Presence of lakes, ponds, etc.
N_SIMP      Numeric         0-0.83            Habitat patch diversity measured using Simpson's index
N_EDGEA     Numeric         0-319.2 m/ha      Anthropogenic edge density calculated using habitat classification system (Table 50.2)
                                              and edge contrast matrix (Table 50.4)
N_EDGEN     Numeric         0-85.3 m/ha       Natural edge density calculated using habitat classification system (Table 50.2)
                                              and edge contrast matrix (Table 50.4)


assume constant variance. Because our response variables
are counts, we assumed a Poisson error distribution
(McCullagh and Nelder 1989; Jones et al.,
Chapter 35). Thus for a set of n explanatory variables,
the Poisson regression model specifies that the
distribution of responses is Poisson and that the log
of the mean is linear in the regression coefficients:

\[ \log(\text{expected count}) = \log(\text{effort}) + b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n \]

where effort, the number of years a station was sampled,
is an offset (StataCorp 1999) to correct for variable
sampling effort between stations. This model of
sampling effort is formally equivalent to assuming



Figure 50.3. Example of inner and outer buffers placed 
around one of the bird survey stations. White areas represent 
clearcuts and shades of gray represent different forest cover 
types. 


TABLE 50.4. 


that, within stations and species, annual counts are in- 
dependent, identically distributed Poisson random 
variates. As an informal test of model specification, we 
used the deviance-based dispersion statistic (sum-of- 
squared deviance residuals divided by degrees of free- 
dom), or mean deviance per sample. Values greater 
than 1.5 indicate overdispersion, implying, for count 
data, that a negative binomial error distribution may be
more appropriate.
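To make the role of the offset and the dispersion check concrete, the sketch below fits a Poisson regression with a single explanatory variable by Newton-Raphson and then computes the mean deviance per residual degree of freedom. It is an illustration written in plain Python, not the Stata procedure used in the study; the data, variable names, and the restriction to one predictor are all assumptions.

```python
import math

def fit_poisson_offset(y, x, effort, max_iter=50, tol=1e-10):
    """Fit log(mu) = log(effort) + b0 + b1*x by Newton-Raphson."""
    b0 = math.log(sum(y) / sum(effort))  # start from the overall rate
    b1 = 0.0
    for _ in range(max_iter):
        mu = [e * math.exp(b0 + b1 * xi) for e, xi in zip(effort, x)]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Fisher information matrix (2 x 2, symmetric).
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y*log(y/mu) - (y - mu))."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

# Hypothetical stations: aggregated counts, years sampled, canopy height (m).
counts = [1, 3, 4, 9, 16, 20]
years = [1, 2, 3, 4, 6, 5]
height = [8.0, 12.0, 15.0, 20.0, 24.0, 28.0]

b0, b1 = fit_poisson_offset(counts, height, years)
fitted = [e * math.exp(b0 + b1 * h) for e, h in zip(years, height)]
deviance = poisson_deviance(counts, fitted)
dispersion = deviance / (len(counts) - 2)  # two estimated coefficients

print(f"b0={b0:.3f} b1={b1:.3f} dispersion={dispersion:.2f}")
```

Because the offset enters with a fixed coefficient of 1, the fitted coefficients describe the expected count per year of sampling rather than the aggregated count.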

For each bird species, we selected explanatory vari- 
ables from the candidate set by a backward stepwise 
procedure (P-to-enter < 0.001, P-to-remove < 0.0015).
A conservative level of significance was chosen as a 
correction for multiple tests. Significance levels were 
based on standard likelihood ratio tests. Model 
strength was measured using the percent of deviance 
explained. This measure is analogous to the multiple 
coefficient of determination (R²) and measures
the proportion of the deviance in the independent 


Anthropogenic (top number) and natural (bottom number) edge contrast values used to calculate
N_EDGEA and N_EDGEN, respectively.

         water   nonfor  ydec    odec    sw      sb      pine    mixed   ccut    anthro
water    0.00
         0.00
nonfor   0.00    0.00
         1.00    0.00
ydec     0.00    0.00    0.00
         1.00    ?       0.00
odec     0.00    0.00    0.00    0.00
         1.00    0.75    0.50    0.00
sw       0.00    0.00    0.00    0.00    0.00
         ?       ?       ?       ?       0.00
sb       0.00    0.00    0.00    0.00    0.00    0.00
         ?       ?       ?       ?       ?       0.00
pine     0.00    0.00    0.00    0.00    0.00    0.00    0.00
         ?       0.75    0.50    ?       0.50    0.50    0.00
mixed    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
         ?       ?       ?       0.25    ?       0.75    0.50    0.00
ccut     ?       0.50    ?       ?       0.75    0.50    ?       ?       0.00
         0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
anthro   1.00    1.00    1.00    1.00    1.00    1.00    1.00    ?       0.00    0.00
         0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
variables associated with the deviance in the dependent
variable (Cameron and Windmeijer 1997).
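The percent of deviance explained can be computed from the null and residual deviances of a fitted model. The sketch below applies the calculation to the BGNW and RBNU values reported in Table 50.5; the function name is illustrative.

```python
def percent_deviance_explained(null_deviance, residual_deviance):
    """Percent reduction in deviance relative to the null (mean-only) model."""
    return 100.0 * (1.0 - residual_deviance / null_deviance)

# (null deviance, residual deviance) as reported in Table 50.5.
reported = {
    "BGNW": (582.73, 266.43),
    "RBNU": (252.08, 143.29),
}

for species, (null_dev, resid_dev) in reported.items():
    print(species, round(percent_deviance_explained(null_dev, resid_dev), 2))
```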

To evaluate the relative influence of local and 
neighborhood habitat variables on each species, we 
compared five alternative habitat models. At the ends 
of the spectrum were the null and full models. The 
null model was simply the mean count over all sta- 
tions while the full model included all local and neigh- 
borhood variables. The three intermediate models 
used subsets of the variables selected by the backward 
stepwise procedure: local variables only, neighbor- 
hood variables only, or both sets of variables. We used 
Akaike's Information Criterion (AIC; Akaike 1974) to
select the best of the five models. AIC measures the 
tradeoff between model goodness of fit (measured as 
the log-likelihood) and model parsimony (measured 
by the number of parameters included in the model). 
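For Poisson models fitted to the same data, AIC differs from the residual deviance plus twice the number of parameters only by a constant, so candidate models can be ranked from those two quantities. The sketch below illustrates this with the BGNW deviances from Table 50.7. The parameter counts are inferred here as 406 minus the residual degrees of freedom, and the constant is backed out from the AIC reported for the local model; both steps are assumptions made for illustration.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion from a maximized log-likelihood."""
    return -2.0 * log_likelihood + 2.0 * n_params

# (residual deviance, number of estimated parameters) for BGNW.
candidates = {
    "local": (333.5, 4),
    "neighborhood": (364.7, 4),
    "local + neighborhood": (266.4, 7),
    "full": (252.5, 26),
}

# Constant shared by all models fitted to the same counts, recovered
# from the reported AIC of the local model (812.6).
shared_constant = 812.6 - (333.5 + 2 * 4)

aic_values = {
    name: deviance + 2 * n_params + shared_constant
    for name, (deviance, n_params) in candidates.items()
}
best = min(aic_values, key=aic_values.get)

for name, value in aic_values.items():
    print(f"{name:22s} {value:7.1f}")
print("lowest AIC:", best)
```

The full model has the smallest deviance, but its additional parameters cost more in the penalty term than they gain in fit.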

We assessed model assumptions by examining diag- 
nostic plots and maps of the response variables and 
the model residuals. Plots of deviance residuals against 
the linear predictor, and a normal scores plot of stan- 
dardized deviance residuals, were used to identify 
skewness and outliers and to assess the overall behav- 
ior of the model. Plots of approximate Cook statistics 
against leverage/(1 - leverage) and case plots of Cook
statistics allowed us to identify potentially influential 
observations and their location in the dataset (i.e., 
case number), respectively. These diagnostic methods 
follow Scott (1997). The maps allowed us to visually 
examine the assumption that counts were independent 
between stations. This assumption may fail, as counts 
at stations within the same site are likely to be corre- 
lated. This is because the mean distance between such 
stations is low and because sites are, by assignment, 
relatively homogeneous. Preliminary inspection of the 
mapped counts indicated the need to test for spatial 
autocorrelation. We used Mantel's test (Mantel 1967), 
which is based on a scalar measure of the association 
between two distance matrices, one describing the ab- 
solute difference in counts between stations and the 
other the Euclidean distance between stations. Signifi- 
cance is tested with respect to a simulated distribution, 
obtained by re-computing the statistic under five thou- 
sand random permutations of one of the matrices 
(Manly 1997). Where there was evidence of spatial 


autocorrelation in counts, we repeated Mantel's test 
on model residuals (after building regression models). 
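A permutation version of Mantel's test can be sketched in a few lines. The code below uses the Pearson correlation between the upper triangles of the two distance matrices as the test statistic and a one-sided permutation p-value; these choices, the function names, and the example transect are assumptions for illustration and may differ from the exact statistic used in the study.

```python
import math
import random

def distance_matrix(items, dist):
    """Square matrix of pairwise distances between items."""
    return [[dist(p, q) for q in items] for p in items]

def matrix_correlation(a, b, order):
    """Pearson correlation between the upper triangles of a and b,
    with the rows and columns of b rearranged according to order."""
    xs, ys = [], []
    n = len(a)
    for i in range(n):
        for j in range(i + 1, n):
            xs.append(a[i][j])
            ys.append(b[order[i]][order[j]])
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def mantel_test(a, b, permutations=999, seed=0):
    """Return the observed statistic and a one-sided permutation p-value."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    observed = matrix_correlation(a, b, identity)
    at_least_as_large = 1  # the observed arrangement counts once
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if matrix_correlation(a, b, order) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)

# Hypothetical transect of eight stations with spatially clustered counts.
coords = [(0, 0), (200, 0), (400, 0), (600, 0),
          (2000, 0), (2200, 0), (2400, 0), (2600, 0)]
counts = [5, 6, 5, 7, 0, 1, 0, 1]

space = distance_matrix(
    coords, lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1]))
difference = distance_matrix(counts, lambda p, q: abs(p - q))

r, p_value = mantel_test(difference, space, permutations=999, seed=1)
print(f"Mantel r = {r:.3f}")
```

Permuting rows and columns together keeps each matrix internally consistent while breaking the pairing between stations and counts.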

The diagnostic plots and maps revealed that the 
most obvious problems were the presence of a few 
large residuals and influential observations, as well as 
some unexplained spatial autocorrelation (see Figs. 
50.4 and 50.5 for examples). Visual inspection of the 
maps indicated that spatial dependence was primarily 
the result of within-site correlation, as expected. Our 
solution to both of these problems (influential obser- 
vations and spatial autocorrelation) was to use 
Stata's cluster option to calculate variance estimates
that are robust to both influential observations and
within-site correlation (StataCorp 1999). Using the
cluster option has the additional benefit of adjusting 
for any undetected overdispersion. 

We estimated model uncertainty using a leave-one- 
out cross-validation approach (Burt and Barber 1996). 
This was done by eliminating observations one at a 
time and predicting this dependent value with a re- 
gression model estimated from the remaining observa- 
tions. We then used the prediction residuals (n = 406) 
to calculate two criteria to assess the performance of 
our habitat models: the percent deviance explained 
(described earlier) and s², the sum of squared deviance
residuals divided by the residual degrees of freedom,
which measured the mean prediction error. The bias in
each criterion was then calculated as the difference
between the mean of the 406 leave-one-out statistics
(e.g., s²) and the actual value obtained when all
observations are included.
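The leave-one-out procedure can be sketched with a deliberately simple model. To stay short, the example below uses a mean-rate model (counts per year of sampling, with no habitat variables) in place of the full Poisson regressions, and compares the mean prediction error from leave-one-out predictions with the value obtained when all observations are used. The data and function names are hypothetical.

```python
import math

def fit_rate(counts, effort):
    """Mean count per unit effort (intercept-only Poisson model with offset)."""
    return sum(counts) / sum(effort)

def leave_one_out_predictions(counts, effort):
    """Predict each station from a model fitted to all other stations."""
    predictions = []
    for i in range(len(counts)):
        rate = fit_rate(counts[:i] + counts[i + 1:],
                        effort[:i] + effort[i + 1:])
        predictions.append(effort[i] * rate)
    return predictions

def mean_deviance(counts, predicted, n_params=1):
    """Sum of squared deviance residuals over residual degrees of freedom."""
    total = 0.0
    for y, mu in zip(counts, predicted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total / (len(counts) - n_params)

# Hypothetical aggregated counts and years sampled at eight stations.
counts = [3, 0, 5, 2, 8, 1, 4, 6]
years = [2, 1, 3, 2, 6, 1, 3, 4]

full_rate = fit_rate(counts, years)
s2_full = mean_deviance(counts, [e * full_rate for e in years])
s2_loo = mean_deviance(counts, leave_one_out_predictions(counts, years))

print(f"s2 (all data) = {s2_full:.3f}, s2 (leave-one-out) = {s2_loo:.3f}")
```

A leave-one-out value much larger than the all-data value would indicate that the fitted model overstates its own predictive ability.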


Results 


The overall distribution and abundance of each 
species is summarized in Table 50.1. Of 102 bird 
species recorded during point count surveys, the six 
species considered here rank among the thirty most 
abundant, with white-throated sparrow and yellow- 
rumped warbler being the most and second-most 
abundant species, respectively. The ranges of values of 
our predictor variables are summarized in Table 50.3. 
The contrast between sites for our continuous vari- 
ables is nearly as high as possible, given the overall 
composition of the study area. 

Statistically significant regression models were 
Figure 50.4. Diagnostic plots for the black-throated green warbler showing the existence of some large
residuals (top panels) and influential observations (bottom panels). Top left: plot of deviance residuals
against linear predictor; top right: normal scores plots of standardized deviance residuals; bottom left: plot
of approximate Cook statistics against leverage/(1 - leverage); bottom right: case plot of Cook statistic.


developed for five out of six species (Tables 50.5 and 
50.6), the exception being the black-capped chick- 
adee. None of these models had deviance statistics 
suggesting overdispersion. We conclude that our 
choices of link and variance functions are appropriate 
and that the models are correctly specified. 

For three species, the best of the five alternate mod- 
els we considered were those including both local and 
neighborhood habitat variables. The models explained 
between 54 and 73 percent of the variation in abun- 
dances of the black-throated green warbler, yellow- 
rumped warbler, and white-throated sparrow. For the 
red-breasted nuthatch and yellow warbler, the best 
models included only local habitat variables and ex- 


plained only 43 and 37 percent of the variation in bird 
abundance, respectively. The model comparison 
process is summarized for the black-throated green 
warbler in Table 50.7. Although the three stepwise 
models (local, neighborhood, and local plus neighbor- 
hood variables) are all highly significant, the third is 
clearly the best. BGNW abundance was positively related
to local canopy height (L_HT) and to features of
the neighborhood (N_DEC, N_SIMP, N_LATE,
N_SW, N_CUT). Of these, the first three were most
significant. High abundances of this species are associ- 
ated with areas of older, structurally diverse, decidu- 
ous dominated forest, containing at least some white 
spruce. The relatively weak positive association with 
Figure 50.5. Simplified spatial distribution of counts for the
black-throated green warbler. Empty circles indicate absence of 
BGNW (n = 182), gray circles indicate mean abundance is less 
than or equal to 1 (n = 112), and black circles indicate mean 
abundance is greater than 1 (n = 112). 


clearcuts may be spurious due to the distribution of 
available habitat relative to harvested areas (see 
Discussion). 

The white-throated sparrow was positively related 
to two local (L_CCUT, L_DEC) and two neighborhood
(N_DEC, N_SIMP) variables, and negatively related
to one local (L_PINE) and two neighborhood
(N_EDGEN, N_MID) variables. The most important
predictors were the presence of clearcuts, the local abundance
of deciduous forest, and the proportion of mid-
seral forest in the neighborhood. This species was 
most abundant in clearcuts adjacent to deciduous for- 
est, with a low proportion of midseral forest in the 
surrounding area. 

The yellow-rumped warbler was negatively associated
with four local variables (L_CCUT, L_ODEC,
L_SIZE, L_YDEC) and positively associated with two
neighborhood variables (N_LATE, N_MID). The
presence of clearcuts and the presence of old deciduous forest
were the most important predictors at the local level,
while the proportions of mid- and late-seral forest were
influential at the neighborhood level. Stations located
in either clearcuts or old deciduous forests had the 
lowest abundance of YRWA whereas stations with a 
high proportion of young and mature forest in the sur- 
rounding area had the highest abundance. 

Both the red-breasted nuthatch and yellow warbler 
were best predicted by models consisting of only two 
local habitat variables. RBNU was associated with tall 
coniferous or coniferous dominated stands; YWAR 
was associated with old, relatively open patches of de- 
ciduous forest. Neither species was sensitive to any of 
the neighborhood variables we measured, after ac- 
counting for variability in local habitat characteristics. 

Our diagnostic plots and maps are illustrated for 
one species (BGNW, Figs. 50.4 and 50.5). Influential 
observations and potential outliers (indicated by high 
residual values) were present. Similarly, the map of the 


TABLE 50.5.

Summary statistics for each bird species' best model (i.e., lowest AIC) indicating the
number of local and neighborhood variables selected, degrees of freedom, the null
model deviance (assuming only a mean effect), the residual deviance unaccounted for
by the model, the percent deviance explained, and AIC values.

Summary statistic         BGNW (a)   RBNU (a)   WTSP (a)   YRWA (a)   YWAR (a)
Local variables           1          2          3          4          2
Neighborhood variables    5          0          4          2          0
Degrees of freedom        399        403        398        399        403
Null model deviance       582.73     252.08     984.60     520.65     122.15
Residual deviance         266.43     143.29     267.69     220.91     77.45
% deviance explained      54.28      43.16      72.81      57.57      36.59
AIC                       751.49     619.35     1364.67    1105.30    248.90

(a) See Table 50.1 for bird code definitions.
TABLE 50.6.

Coefficients, robust standard errors, and a measure of
variable importance for habitat variables selected in the best
model developed for each species.

                                    Robust
                                    standard    Importance (b)
Species  Variable    Coefficient    error (a)   (%)

BGNW     L_HT         0.105         0.021       24.7
         N_CUT        1.348         0.533       3.9
         N_LATE       2.333         0.461       ?
         N_DEC        2.195         0.300       ?
         N_SW         0.459         0.136       4.4
         N_SIMP       3.344         0.500       14.4
         Constant    -8.853         0.650

RBNU     L_DEC       -0.875         0.148       43.2
         L_HT         0.099         0.009       ?
         Constant    -3.440         0.246

WTSP     L_CCUT       2.463         0.229       38.0
         L_PINE      -1.356         0.366       ?
         L_DEC        2.366         0.226       32.8
         N_MID       -1.555         0.160       ?
         N_DEC        0.915         0.265       9.4
         N_SIMP       1.475         0.289       13.9
         N_EDGEN     -0.008         0.002       4.2
         Constant    -2.862         0.282

YRWA     L_CCUT      -2.638         0.438       36.7
         L_ODEC      -0.336         0.063       7.8
         L_YDEC      -0.511         0.179       6.4
         L_SIZE      -0.001         0.000       8.6
         N_MID        1.055         0.303       8.6
         N_LATE       1.000         0.284       ?
         Constant    -0.932         0.238

YWAR     L_ODEC       1.768         0.276       18.6
         L_CROWN     -0.029         0.005       30.1
         Constant    -3.081         0.255

(a) All variables are significant at the 5 percent level using robust
standard errors.

(b) The importance of a variable within the final model is measured by 1
minus the ratio of the deviance explained by the final model and the
deviance explained by a reduced model where the variable was
excluded.


spatial distribution of counts provided some visual ev- 
idence of spatial autocorrelation (Fig. 50.5). This was 
confirmed using Mantel's test (Table 50.1). Spatial au- 
tocorrelation of the model residuals was less striking, 
but Mantel’s test statistic was still significant (0.05 > p 
> 0.01). Similar patterns were observed for the other 
species, except the black-capped chickadee, whose 


counts were not spatially correlated. These results jus- 
tify our use of robust variance estimators (Table 50.6). 

Cross-validation analysis revealed that the predictive
ability of the Poisson regression models was
neither under- nor overestimated. The strengths of 
the original models versus the cross-validated mod- 
els, as measured by percent deviance explained, were 
within 0.2 percent of each other. Similarly, estimates 
of prediction error using the sum-of-squared de- 
viance residuals divided by the residual degrees of 
freedom (s²) agreed to three significant digits. There
was no evidence of bias in either of the two measures 
that we used for assessing the predictive ability of the 
models. 


Discussion 


In general, our quantitative models are consistent with 
qualitative accounts of habitat requirements for the 
selected species (e.g., Ehrlich et al. 1988; Semenchuk
1992; Kaufman 1996; Fisher and Acorn 1998), and 
we do not dwell on specific interpretations of variables 
here. A notable anomaly, however, was inclusion of 
the proportion of clearcut in the neighborhood as a 
positive predictor of black-throated green warbler 
abundance. As the species has been identified as one at 
risk due to forest harvesting (Schmiegelow and Han- 
non 1999; Norton 1999), the result was surprising. 
We believe it to be an artifact of the nonrandom har- 
vest of suitable BGNW habitat in areas adjacent to 
sampled sites in order to satisfy an experimental de- 
sign. Many sampling sites at which BGNWs were 
recorded were previously contiguous stands of older, 
deciduous-dominated forest fragmented by harvesting. 
Of the five species we modeled, the BGNW showed 
the greatest affinity for these older stands, and thus 
occupied sites were closer in proximity to harvested 
areas than expected by chance. Nevertheless, such 
post hoc explanations emphasize the importance of 
model validation and will be discussed later in the text 
in some detail. 

An interesting outcome of our analyses was the
variation in the inclusion of local and neighborhood 
habitat descriptors in species models. We quantified 
habitat composition and configuration at two scales 
in order to test whether the spatial context of habitat 
TABLE 50.7.

Summary of alternative habitat models for the black-throated green warbler.

                                                                      % Deviance
Model                  Habitat variables                Df   Deviance (a)   explained (a)   AIC (a, b)
Null                                                    405  582.7
Local                  L_MIXED, L_ODEC, L_HT            402  333.5          42.8            812.6
Neighborhood           N_LATE, N_DEC, N_SIMP            402  364.7          37.4            843.8
Local + neighborhood   L_HT, N_CUT, N_LATE,             399  266.4          54.3            751.5
                       N_DEC, N_SW, N_SIMP
Full                   All local + neighborhood         380  252.5          56.7            775.6
                       variables

(a) Deviance, percent deviance explained, and AIC are explained in the text.

(b) The model with the lowest AIC is italicized.


patches affected habitat selection at the level of terri- 
tories. Our results suggest that, for some species, 
habitat quality at the level of territories (as approxi- 
mated by long-term mean species abundances) is me- 
diated by characteristics of the surrounding area. In 
other cases, only local characteristics explained vari- 
ation in abundances. In the latter cases, explained 
deviance of our models was lower than for species 
models that included habitat descriptors at multiple 
scales. We present two contrasting interpretations of 
these results. 

First, species responding only to local habitat char- 
acteristics may have selected specific features, regard- 
less of their spatial context, and these features were 
not well described by relatively coarse-scale habitat 
data. Second, such species represent generalists rela- 
tively insensitive to both habitat composition and con- 
figuration, regardless of scale. We can test hypotheses 
arising from the first interpretation with extensive, de- 
tailed vegetation data collected in our study area over 
the same time period during which birds were sam- 
pled. The second interpretation is testable in the ana- 
lytical framework presented here, with a larger suite 
of species. Our selection of species for preliminary 
model development was not made on this basis. Re- 
gardless, we emphasize that the species models con- 
taining only local habitat descriptors, although poorer 
than those containing both local and neighborhood 
variables, were still able to account for about 40 percent
of the variation in abundance using, in each case,
only two habitat variables derived from forest inventory
data.
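
As a worked check on the deviance figures reported in Table 50.7, the sketch below recomputes percent deviance explained as the proportional reduction from the null-model deviance. The function name is hypothetical, and the formula is the conventional definition, which we assume matches the one used in the text.

```python
def percent_deviance_explained(null_deviance, model_deviance):
    """Proportional reduction in deviance relative to the null model, in percent."""
    return 100.0 * (null_deviance - model_deviance) / null_deviance


NULL_DEVIANCE = 582.7  # null model deviance from Table 50.7

# Residual deviances for two of the candidate models in Table 50.7.
models = {"Local": 333.5, "Neighborhood": 364.7}

for name, deviance in models.items():
    pct = percent_deviance_explained(NULL_DEVIANCE, deviance)
    print(f"{name}: {pct:.1f} percent of deviance explained")
```

Under this definition the local-only model recovers the tabulated 42.8 percent, which is the "about 40 percent" referred to above.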

A fundamental criticism of habitat-based models is 
that they are rarely validated (Hansen et al. 1999). As 
a first step, we employed a statistical cross-validation 
approach. However, because we are ultimately inter- 
ested in testing hypotheses about landscape dynamics 
and wildlife response, our predictive models must be 
validated across a range of spatial scales and geo- 
graphic locations. We plan to use spatially referenced 
bird data from other localized studies in the boreal 
mixedwood for geographic validation. At a coarser 
scale, survey data in the form of Breeding Bird Survey 
routes and the Alberta Breeding Bird Atlas permit us 
to test whether our finer-scale models can be general- 
ized, or in other words, whether the spatial scale of re- 
sponse to habitat varies. 

Another criticism of habitat-based abundance 
models is that abundance in a given habitat is not 
necessarily indicative of quality, as measured by re- 
productive success (Van Horne 1983). If occupation 
of sink habitats (those where expected reproduction 
is below replacement) is limited by immigration from 
nearby source habitats (e.g., Pulliam 1988), then 
projections of population abundance may fail to pre- 
dict actual population persistence. In the absence of 
productivity measures, we use long-term mean abun- 
dance as a proxy for habitat quality (see also argu- 
ments in Boyce and McDonald 1999). We assume 
that the system is dynamic and unsaturated, and that 
optimal habitat will be most frequently occupied, 
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subject to variation imposed by environmental sto- 
chasticity, individual mortality, and dispersal limita- 
tion, or in other words, an ideal free distribution 
(Fretwell and Lucas 1969). 

Occupancy of suboptimal habitats will also be a 
function of their density. In regions where suboptimal 
habitats make up a large portion of the landscape, sur- 
plus production from the rare nearby source habitats 
is unlikely to provide colonists to occupy sink habitats 
with a frequency sufficient to permit estimation of 
their relative suitability. In such areas, observed abun- 
dances in areas of relatively low predicted quality 
should be less than expected, given our models. Thus, 
our models can be used to design extensive sampling 
efforts to test metapopulation models and identify 
sink habitats. In such an application, source habitats 
would be identified as areas of both high predicted 
quality and high observed abundances. 

This again points to the necessity for model valida- 
tion across multiple sites and scales (extent and grain), 
as source habitat detection is dependent on predicted 
habitat quality. In the landscape from which the pres- 
ent models were derived, the contrast in continuous 
variables (at local and neighborhood scales) was 
nearly as high as possible (see Table 50.3) given the 
overall composition of the study area, but this area is 
dominated by older forest. An efficient way of pro- 
ceeding may be to use our existing models to locate 
new sampling sites in areas with high contrast in inde- 
pendent variables that are highly significant in existing 
models, or where uncertainty (prediction variance) is 
high. Criteria based on the expected influence of new 
observations on parameter estimates or other measures
based on Fisher's information matrix can also be
devised. In addition, future sampling sites should tar- 
get areas with contrast in independent variables that 
are anticipated to change most as industrial develop- 
ment in the region proceeds (e.g., the density of an- 
thropogenic edges). 

In Alberta, forest management planning is largely 
based on forest inventory information, but the ability 
of such information to predict species abundances 
has not previously been evaluated. We attempted 
such an evaluation, using Poisson regression analysis 
to model the relationship between bird species abun- 
dances observed in the field and habitat characteristics
derived from forest inventory data. Poisson regression
deals explicitly with characteristics of count
data and is generally more efficient and consistent, 
and less biased than linear regression models using 
the same data (Scott 1997). For these reasons, it is 
receiving increasing attention in the ecological litera- 
ture (e.g., Nicholls 1989; Bustamante 1997). How- 
ever, it is still necessary to check model assumptions, 
including statistical independence of observations, 
correct specification of the link and variance func- 
tions, correct scale for measurement of the explana- 
tory variables, and lack of undue influence of indi- 
vidual observations on the fitted model (Scott 1997). 
In our case, residual diagnostics revealed several in- 
fluential observations, as well as spatial dependence 
in the model residuals. Hence, we used robust vari- 
ance estimates and corrected for spatial dependence 
within bird survey clusters (sites). Our final models 
demonstrated good predictive ability with no evi- 
dence of bias. We conclude that their use in land- 
scape simulations is justified pending their validation 
against independent data sets. 
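
To make the modeling step concrete, the following is a minimal sketch of a Poisson regression with a log link and a single habitat covariate, fitted by Newton-Raphson. It is an illustration under stated assumptions, not the analysis used here: the function names and the toy data are hypothetical, only one predictor is handled, and the robust variance estimates and the correction for spatial dependence described above are not implemented.

```python
import math


def fit_poisson(x, y, iterations=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x to counts y by Newton-Raphson."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2 x 2) for the log link.
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2 x 2 system explicitly.
        d0 = (i11 * g0 - i01 * g1) / det
        d1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def poisson_deviance(x, y, b0, b1):
    """Residual deviance of the fitted model (zero counts contribute only mu)."""
    total = 0.0
    for xi, yi in zip(x, y):
        mu = math.exp(b0 + b1 * xi)
        log_term = yi * math.log(yi / mu) if yi > 0 else 0.0
        total += log_term - (yi - mu)
    return 2.0 * total


# Toy data (hypothetical): counts that double with each unit of the covariate.
habitat = [0, 1, 2, 3]
counts = [1, 2, 4, 8]
b0, b1 = fit_poisson(habitat, counts)
print(b0, b1, poisson_deviance(habitat, counts, b0, b1))
```

Because the toy counts follow an exact doubling pattern, the fitted slope approaches log(2) and the residual deviance approaches zero; real survey counts would leave a positive residual deviance that can be compared against the null model.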

We believe the approach to modeling abundance 
presented here is robust and appropriate to the ques- 
tions at hand, namely, assessing the potential ecologi- 
cal outcomes of various forest management scenarios 
in the boreal mixedwood forests of Alberta. We 
demonstrated that predictive models of bird abun- 
dance could be generated from forest inventory data, 
permitting evaluation of activities at a resolution and 
extent commensurate with management planning. The 
models presented here are a subset of those we have 
developed, representing species with a range of ob- 
served abundances and expected responses to forest 
fragmentation (Schmiegelow and Hannon 1999). Our 
final selection of species to model for scenario evalua- 
tions requires identification of species most at risk 
from land-use practices (primarily forestry and energy 
sector development) that are resulting in widespread 
habitat modification in Alberta's boreal forests (see 
Hansen et al. 1999 for a summary of approaches). 

Adaptive resource management requires eval- 
uation, prediction, and monitoring. Uncertainties 
associated with policy options are exposed as alter- 
native hypotheses, implemented management strate- 
gies are treated as experiments, and observed conse- 
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quences are used to test hypotheses and refine man- 
agement (Walters and Holling 1990). Land managers 
in Alberta have professed the advantages of an adap- 
tive approach to resource management. We hope our 
work will provide some of the tools necessary for 
this process, through identification of measurable 
parameters, and development of analytical tech- 
niques, for assessment and monitoring of manage- 
ment activities. 
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Species Commonness and the Accuracy of 
Habitat-relationship Models 


Jason W. Karl, Leona K. Svancara, Patricia J. Heglund, Nancy M.
Wright, and J. Michael Scott


Two types of error are possible when assessing the
accuracy of models predicting species presence or
absence: omission error (failure to predict species
occurrence in an occupied area) and commission error
(prediction of species occurrence in unoccupied
areas) (see Fielding, Chapter 21). Of these two,
omission errors are relatively easy to measure (Krohn
1996; Karl et al. 2000) because observation of a 
species in an unpredicted area necessitates an omission 
error. Conversely, failure to observe a species in a pre- 
dicted area, while necessary to the definition, is not 
sufficient to classify it as a commission error (Krohn 
1996; Boone and Krohn 1999; Karl et al. 2000). This 
can be due to inefficient or inappropriate sampling, 
species life history characteristics (e.g., avoids hu- 
mans, cryptic nature, episodic), or temporal and spa- 
tial variation in species distributions (Karl et al. 2000; 
Fielding, Chapter 21; Schaefer and Krohn, Chapter 
36). Thus, field measures of commission error contain 
both true error and apparent error (Karl et al. 2000; 
Schaefer and Krohn, Chapter 36). 

Attributes of species biology can affect our esti- 
mates of model accuracy, but the effect of rarity on 
model accuracy is not well defined. It has been pro- 
posed that the presence of “species with high spatial 
and temporal evenness” (Krohn 1996) (e.g., common 
species) would be easier to predict with habitat- 
relationship models (e.g., gap analysis models) than
species with low evenness (Boone and Krohn 1999)
for most modeling applications. Karl et al. (2000)
reported a significant decline in commission error,
accompanied by a slight increase in omission error,
with increasing number of species detections on two
study areas in north Idaho. As such, apparent error
decreased with increased sample size. However, it was unclear
whether high error rates at low numbers of detections 
were a result of differences in model accuracy between 
rare and common species or an artifact of sample size 
used to estimate model performance. 

A rarity effect would exist if the models for species 
less-frequently encountered were less accurate than 
those for common species. Lower model accuracy for 
rare species in one situation could be caused by in- 
complete knowledge of the species’ range or habitat 
associations, or the species responding to habitat fea- 
tures that cannot be measured (or mapped). Alterna- 
tively, because large numbers of rare species detections 
often take a large investment of time and money, 
model accuracy is assessed with few data points (if 
done at all). Depending on the statistics used, accuracy 
assessment with small sample sizes could lead to erro- 
neous measures. 

We investigated whether the pattern described by 
Karl et al. (2000) was due to a rarity effect or to 
an artifact of sample size. We simulated small sample 
sizes by randomly subsampling our data set for 
the most common species and using the subset of 
observations to test model accuracy. By doing this, we
held the biological attributes of species constant, vary- 
ing only the sample size. If models developed for rare 
species (i.e., those with few detections) have poorer 
prediction accuracy than common ones (rarity effect), 
then the slope of regression lines from a plot of error 
rates against number of detections for the field data set
should be steeper than that obtained by simulation. 
Although this approach did not consider reasons for 
rarity and may not appropriately approximate distri- 
bution of rare species, it was adequate for examining 
the effects of sample size on model accuracy. 


Study Area 


Our study area encompassed most of the Idaho por- 
tion of U.S. Forest Service (USFS) Northern Region 
(the Idaho Panhandle, Clearwater, and Nez Perce Na- 
tional Forests) as well as land owned by the Potlatch 
Corporation (Fig. 51.1). This area (2.75 million 
hectares) begins just north of the Clearwater River, ex- 
tending northward to the tip of the Idaho panhandle, 
but excluding the dry grasslands of the Snake River 
Valley and the Palouse agriculture lands. Most of this 
area is dominated by mixed coniferous forests in vari- 
ous stages of timber management. 


Methods 


Breeding birds were surveyed on the U.S. Forest Service 
Northern Region in 1994 to 1996 (R. L. Hutto and 
U.S. Forest Service unpublished data; P. J. Heglund, 
Potlatch Corporation unpublished data) using a vari- 
able-radius circular plot technique (Ralph et al. 
1995a). Each of 1,628 survey points was surveyed one
time per year for up to three years following the meth- 
ods described by Hutto and Hoffland (1996). 

We eliminated from the data set all birds that were 
flying when detected, except for those birds whose de- 
tections are mostly restricted to aerial foraging (i.e., 
swallows, swifts, hawks). We further truncated the 
data set to only those observations occurring within 
50 meters of the survey point for two reasons. First, 
the ability to accurately judge the distance of an ob- 
servation and the cover type in which it occurred de- 
creases with distance from the survey point (Hutto
and Hoffland 1996; see also Scott et al. 1981). Sec-
ond, limiting the area of analysis around the survey 
point reduces the potential for variation in the values 
of the geographic information system (GIS) data lay- 
ers around the survey point. 

We received GIS coordinates for the survey points 
from the U.S. Forest Service Northern Region's Land- 
bird Monitoring Program. These coordinates were 
digitized from geo-registered aerial photographs of the 
study area. We then converted the vector point cover- 
ages from each study area to raster grids with a 0.09- 
hectare cell size. 

We used models developed by Scott et al. (unpub- 
lished data) for the Idaho Gap Analysis Project to pre- 
dict the presence/absence of the species detected in the 
breeding bird surveys. These models were built using 
methods proposed by Scott et al. (1993) (see also But- 
terfield et al. 1994; Csuti 1996; Smith and Catanzaro 
1996) consisting of four major steps: (1) establishing a 
species list, (2) defining species range limits, (3) col- 
lecting species habitat information and determining 
habitat relationships, and (4) modeling the species 
habitat in a GIS using the information gathered. 

To assess model accuracy, we compared the model 
predictions with survey data for each species detected. 
We tallied the number of omission errors (observed, 
not predicted) and commission errors (predicted, not 
observed) and calculated percent omission (number of 
omissions divided by the total number of observa- 
tions) and commission error (number of commissions 
divided by the total number of survey points), respec- 
tively. All species measures were combined into one 
data set. We plotted omission and commission error 
by the number of species detections for all species. An 
inverse relationship existed between omission and 
commission errors (Karl et al. 2000), but this
relationship was not easily quantifiable. For this reason,
we treated omission and commission error separately.
We regressed each error rate against number of
detections to obtain a regression coefficient and
standard error describing the relationship between
model error and number of detections.
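
The tallies described above can be sketched as follows. This is a minimal illustration, assuming that each detection is recorded as the identifier of the survey point where it occurred and that model predictions are available as a set of point identifiers; the function and argument names are hypothetical.

```python
def error_rates(predicted, detections, n_points):
    """Return (omission rate, commission rate) for one species.

    predicted  -- set of survey-point ids where the model predicts presence
    detections -- list of survey-point ids, one entry per detection
    n_points   -- total number of survey points
    """
    if not detections:
        raise ValueError("omission rate is undefined without detections")
    # Omission: species observed at a point where it was not predicted.
    omissions = sum(1 for point in detections if point not in predicted)
    # Commission: species predicted at a point where it was never observed.
    commissions = len(predicted - set(detections))
    return omissions / len(detections), commissions / n_points


# Toy example (hypothetical): 10 survey points, 4 predicted, 4 detections.
omission, commission = error_rates({1, 2, 3, 4}, [1, 1, 2, 5], 10)
print(omission, commission)
```

In the toy example one of four detections falls at an unpredicted point, and two of ten survey points are predicted but never yield a detection.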

Figure 51.1. The Idaho portion of the U.S. Forest Service Northern Region consists of 2.75 million hectares
dominated by coniferous forest land cover types, interspersed with dry grasslands and shrublands.

We selected the seven species with more than five
hundred detections and subjected their accuracy
assessment to a simulation designed to approximate
rarity. Exploratory data analysis indicated variability
of omission error estimates was small for species with 
more than five hundred detections. Additionally, the 
seven species selected shared similar life history attrib- 
utes (i.e., broadly distributed, similar habitat associa- 
tions). For each species, we randomly selected a subset 
of its observations and estimated accuracy with this 
subset. Subset size was varied from five to the full 
number of observations for that species by increments 
of five (e.g., 5, 10, 15, . . . ). We repeated this proce- 
dure for each of the seven species. Simulation data for 
all seven species were combined into one data set. 
Once the simulations were run, we plotted the simu- 
lated accuracy data against the number of observa- 
tions included in each subset. We separately regressed
omission and commission error rates against number
of detections to obtain a regression coefficient and
standard error describing the relationship between
model error types and number of detections.
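
The subsampling procedure can be sketched as below. This is a minimal illustration under assumptions: detections are stored as survey-point identifiers, one random subset is drawn per subset size, and the error definitions follow those given earlier in the Methods. The function name, the fixed random seed, and the toy data are hypothetical.

```python
import random


def simulate_subsamples(predicted, detections, n_points, step=5, seed=0):
    """Estimate error rates from random subsets of a species' detections.

    Returns a list of (subset size, omission rate, commission rate), with
    subset size increasing from `step` to the full number of detections.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    results = []
    for size in range(step, len(detections) + 1, step):
        subset = rng.sample(detections, size)  # sampling without replacement
        omissions = sum(1 for point in subset if point not in predicted)
        commissions = len(predicted - set(subset))
        results.append((size, omissions / size, commissions / n_points))
    return results


# Toy data (hypothetical): 100 survey points, presence predicted at points
# 0-79, and 200 detections spread over points 0-89.
predicted = set(range(80))
detections = [i % 90 for i in range(200)]
for size, omission, commission in simulate_subsamples(predicted, detections, 100)[:3]:
    print(size, omission, commission)
```

With only five detections in a subset, at least 75 of the 80 predicted points cannot have been observed, so estimated commission error starts high and falls as the subset grows, while the omission estimate fluctuates around its full-sample value.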

If the observed pattern of change in error rates with 
number of species detections is an artifact of sample 
size, the slope of a linear regression line for the field 
data should be the same as that obtained by simula- 
tion. However, if there is a rarity effect, causing the 
models of less-common species to have lower accuracy 
than more common ones, then the slope of the field 
data regression line should be steeper. To test for this,
we used a Student's t-test with the following null
hypotheses:


$H_0: \beta_{cf} = \beta_{cs}$ (51.1)


$H_0: \beta_{of} = \beta_{os}$ (51.2)
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Figure 51.2. Maximum and minimum bounds on the possible 
values that an estimate of omission error assumes are de- 
pendent on the sample size. Sample size has been scaled to 
the percentage of points necessary to sample every individual 
in the study area. 


where $\beta_{cf}$ is the slope of the regression line from the
field observations for commission error, $\beta_{cs}$ is the
slope of the regression line from simulation for
commission error, $\beta_{of}$ is the slope of the regression line
from field observations for omission error, and $\beta_{os}$ is
the slope of the regression line from simulation for
omission error. Because the simulations yielded large
amounts of data behaving in mostly predictable pat- 
terns, the standard errors for the simulation regression 
coefficients were very small with respect to the param- 
eter estimates. Thus, for the purpose of comparison, 
we constructed our statistical tests treating the simula- 
tion results as constants (Ramsey and Schafer 1997). 
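
A comparison of this kind can be sketched as a one-sample t statistic on the field slope, with the simulation slope treated as a known constant. The sketch below is an illustration under assumptions: the function names and toy data are hypothetical, the regression is ordinary least squares with one predictor, and the P-value lookup (a t distribution with n - 2 degrees of freedom) is left out because the Python standard library does not provide it.

```python
import math


def ols_slope(x, y):
    """Ordinary least-squares slope and its standard error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # residual variance uses n - 2 df
    return slope, se


def t_against_constant(slope, se, constant):
    """t statistic for H0: slope equals a fixed constant."""
    return (slope - constant) / se


# Toy data (hypothetical): error rate regressed on number of detections.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, se = ols_slope(x, y)
print(slope, se, t_against_constant(slope, se, 2.0))
```

Treating the simulation slope as a constant ignores its sampling error, which is reasonable only when that error is very small relative to the standard error of the field slope.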

Plotting the possible values that an estimate of
omission or commission error could attain for a given
sample size gave insight into the bounds within which
error rates must fall. To see how upper and lower
bounds for omission error rates changed (Fig. 51.2),
we assumed that a given model had a true omission
error ($O_t$), that there were a definite number of
individuals within the modeling area at a given time ($N$),
and that at some maximum amount of effort all $N$
individuals were sampled and $O_t$ obtained. For all
detections of $n$ individuals (where $n \le N$), omission
error rates were bounded by 0.0 and 1.0 as long as
$n/N \le O_t$. When the proportion of sampled individuals
($n$) to the total number of individuals on the study
area ($N$) exceeded the true omission error of the
model, the upper bound decreased as

$O_{\max} = \frac{O_t N}{n}$ (51.3)

The minimum bound for omission errors remained
0.0 as long as $n/N \le 1 - O_t$. When the proportion of
sampled individuals ($n$) to total individuals ($N$)
exceeded one minus the true omission error rate ($O_t$),
the lower bound increased as

$O_{\min} = \frac{n - N(1 - O_t)}{n}$ (51.4)

When $n$ reached $N$, the only value that could be
obtained for estimated omission error was $O_t$.
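
The bounds on estimated omission error follow from counting: of N individuals, a fraction equal to the true omission rate lies in unpredicted areas, and a sample of n individuals can include at most all of them and at least as many as cannot be avoided. The sketch below illustrates this reasoning; the function name is hypothetical and the true omission rate is assumed known.

```python
def omission_bounds(n, total, true_omission):
    """Lower and upper bounds on estimated omission error for a sample of n.

    n             -- number of individuals sampled (1 <= n <= total)
    total         -- number of individuals in the modeling area (N)
    true_omission -- true omission error rate of the model
    """
    if not 0 < n <= total:
        raise ValueError("n must be between 1 and the total number of individuals")
    omitted = true_omission * total  # individuals in unpredicted areas
    correct = total - omitted        # individuals in predicted areas
    upper = min(1.0, omitted / n)          # sample drawn from omitted individuals first
    lower = max(0.0, (n - correct) / n)    # sample drawn from correct individuals first
    return lower, upper


# Toy example (hypothetical): N = 100 individuals, true omission rate 0.2.
for n in (10, 50, 90, 100):
    print(n, omission_bounds(n, 100, 0.2))
```

For small samples the estimate can fall anywhere between 0.0 and 1.0; the upper bound begins to fall once n/N exceeds the true omission rate, the lower bound begins to rise once n/N exceeds one minus that rate, and the two meet at the true rate when every individual is sampled.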

To see how upper and lower limits of commission
error rates changed with sample size (Fig. 51.3), the
same types of assumptions as for omission error rate
bounds were made (i.e., actual number of sampling
units and true commission error rate [$C_t$] that could
be attained with some maximum effort). Additionally,
the total number of predictions made ($P$) and the true
omission error rate ($O_t$) must be known. The
minimum bound for commission error rates originated at
1.0 for $n = 0$ and decreased linearly until estimated
commission error reached $C_t$. The maximum
commission error rate bound was 1.0 until $n/N$ exceeded the
omission error rate, after which it decreased linearly at the
same rate as the minimum bound until $C_t$ was
reached. The greatest difference between the
maximum and minimum bounds for commission error
rates was $O_t$.


Results 


The graph of commission error by number of detec- 
tions (Fig. 51.4a) showed a strong negative trend as 
sample sizes increased across all species (R2 = 0.9861;
P << 0.0001) and behaved as predicted (Fig. 51.3). The
regression line intercept was approximately equal to 1 
(i.e., no observations necessitates total commission 
error). Commission error rates decreased 0.1 (or 10 
percent) for every 167 observations. Five species had 
commission error rates less than predicted by the re- 
gression line (western meadowlark [see Appendix for 
scientific names and number of detections], spotted 
towhee, yellow warbler, song sparrow, warbling 
vireo). Omitting the seven species included in the
simulation did not significantly change the regression
coefficient ($\beta_{cf}$ = -0.0007; R2 = 0.9577; P << 0.0001).

Figure 51.3. Maximum and minimum values that estimates of
commission error assume are dependent on sample size and
the actual omission error rate of the model. Sample size has
been scaled to the percentage of points necessary to sample
every individual in the study area.

Omission error rates showed a statistically significant
decrease with increasing number of detections
(Fig. 51.4b; R2 = 0.0716; P = 0.0051). Given the low
correlation, however, we did not consider this biologi- 
cally significant because the change was less than 
0.025 across the range of sample sizes 5 to 899. Vari- 
ation in the values of omission error rates decreased as 
sample size increased. This was in line with our pre- 
diction (Fig. 51.2). Four species had significantly 
higher omission error rates than other species with 
similar numbers of detections (yellow warbler, song 
sparrow, black-capped chickadee, warbling vireo). 
Omitting the seven species included in the simulation 
significantly changed the regression coefficient ($\beta_{of}$ =
-0.0009; R2 = 0.0617; P = 0.0123). We also did not
consider this biologically significant. 

In our simulation studies, commission error rates 
decreased predictably as sample size increased (Fig. 
51.5a; R2 = 0.9973; P << 0.0001). The regression line 
intercept was equal to one. Omission error was gener- 
ally low and showed little correlation with
sample size, although the relationship was
statistically significant due to the
large sample size (Fig. 51.5b; R2 = 0.0283; P <<
0.0001). Given the low correlation, we did not consider
it biologically significant. Variation in the simulated
omission error rates tended to decrease as sample
size increased.


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Figure 51.4. Change in error rates with number of detections 
for 108 bird species detected on the Idaho portion of U.S. For- 
est Service Northern Region. The seven species with more 
than five hundred detections (marked with dark triangles) were 
used in the simulation exercise. The dispersed nature of esti- 
mated commission (a) and omission (b) error rates obscured 
trends in the data due to sample sizes. Given that models with 
high commission error rates had low omission error and vice 
versa (indicating either over- or underprediction, correspond- 
ingly), we averaged commission and omission error rates for 
each model. Black triangles indicate the seven species in- 
cluded in the simulation. BCCH = black-capped chickadee (See 
Appendix for scientific names), WAVI = warbling vireo, YEWA = 
yellow warbler, SOSP = song sparrow, SPTO = spotted towhee, 
and WEME = western meadowlark. 


Field and Simulation Comparison 


Field estimates of commission error change with 
number of detections were not significantly different 
from simulation estimates (P = 0.1747). The slope of 
the regression line for change in field estimates of 
omission error with number of detections was signif- 
icantly less than that of simulation estimates (P = 
0.0065). Given the variability in the omission error
data, we do not believe that this difference is
biologically significant.

Figure 51.5. Change in error rates with simulated number of
detections for the seven most common species detected on
the U.S. Forest Service Northern Region. Because random
subsets of observations were selected from the total observation
set for each species, commission error decreased in a
predictable manner (a). Omission error rates were low and
exhibited more variability (b). Mean error rates for the simulations
indicated similar patterns in error rate change with number of
detections as the field observations.


Error Rate Possibilities 


We found it was possible to account for the pattern in 
model error by changing the number of species detec- 
tions. Error rates at small sample sizes were character- 
ized by high estimates of commission error and high 
variability in omission error estimates. Commission 
error rates declined predictably with increasing num- 
ber of observations. Variability in omission error esti- 
mates also decreased with increased observations. For 
predicting presence and absence of the seven simu- 
lated species, we can estimate the true versus apparent
error at sample sizes less than the full number of
detections (assuming that commission error at the full
number of detections is the actual commission error of 
the model). At the smallest sample size (five detec- 
tions), apparent error accounted for as much as 55 
percent of measured commission error when averaged 
over the seven simulated species. 

Simulation results suggest that, for forest songbirds on
our study area, approximately 167 observations are
needed to decrease commission error estimates by 10
percent. Potentially, more data would be needed for
highly confident accuracy measures than was neces- 
sary for constructing the model. However, this is un- 
doubtedly related to number of survey points versus 
area modeled. Still, this is a significant finding, as most 
accuracy assessments for wildlife-habitat models are 
either carried out with a very small number of field 
observations or not conducted at all (Salwasser and 
Krohn 1982; Morrison et al. 1998; Verbyla and Lit- 
vaitis 1989; T. C. Edwards personal communication). 
The additional expense of obtaining a test set of
sufficient size may not always be feasible for managers
operating with small budgets, so project goals and the
precision of results may need to be modified to fit
within budgetary constraints.

Rabinowitz et al. (1986; see also Rabinowitz 1981) 
described rarity in terms of the interaction of geo- 
graphic range, habitat specificity, and local density. 
Under this hypothesis, a species that occurred over a 
large region and in a variety of ecological conditions 
but had naturally low densities can be distinguished 
from a narrow endemic species that was strongly asso- 
ciated with localized habitat features but occurred in 
dense populations. This has important implications 
for assessing the accuracy of wildlife-habitat models. 
For habitat-general species that occurred in low densi- 
ties over large regions, commission error rates at low 
sample sizes would contain a large apparent error 
component. However, for habitat-specific species oc- 
curring in high densities over small areas, true com- 
mission error may be much greater than apparent 
model error. Boone and Krohn (1999) attempted to 
quantify the attributes associated with rarity in Maine 
birds to predict whether wildlife-habitat models could 
be expected to have high apparent error components. 

The intermountain northwest of the United States 
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has relatively few endemic bird species (AOU 1998). 
Therefore, the species that we detected infrequently 
would most likely fit into the category of broad-range, 
low-density species (after Rabinowitz et al. 1986). Ad- 
ditionally, simulation of rarity by random subsam- 
pling of a data set would tend to produce distributions 
equivalent to that of a broad-range, low-density 
species. We then would not expect the models for 
most species we detected infrequently to perform any 
worse than more-abundant species. However, more re- 
search should be directed toward the effects of other 
factors contributing to rarity (i.e., geographic range, 
habitat specificity). 

Given that the presence or absence of a species is 
related to habitat features that are easily mapped, it 
is plausible that the ability to correctly model species 
occurrence could be as much a function of how 
much is known about the species as it is a function of 
factors contributing to rarity. In the case of a species 
with a limited geographic range, incomplete knowl- 
edge as to the extent of its range could result in 
higher commission error. For widely distributed 
species occurring at low densities, apparent model 
error is likely very high given the difficulty in collect- 
ing sufficient observations. However, often more is 
known about the habitat associations and ranges of 
the rarest species than many common ones. There- 
fore, small sample sizes preclude reliable estimates of 
accuracy of habitat-relationship models for many 
rare species. 

To the manager using habitat-relationship models 
to aid decision-making, this means that reported accu- 
racies could be misleading. We do not suggest that
effort spent assessing model accuracy is wasted.
Assessment with even the smallest sample size
can give some information about model performance.
However, the results of such calculations should be
viewed with extreme caution since actual error rates
could be above or below what is estimated.


Appendix 


Common and scientific names for species detected on 
U.S. Forest Service's Northern Region (Region 1), and 
the number of sites at which each species was detected. 


Common name    Scientific name    No. of detection sites
Mallard Anas platyrhynchos D 
Common merganser Mergus merganser 2 
Osprey Pandion haliaetus 2 
Sharp-shinned hawk Accipiter striatus 4 
Cooper's hawk Accipiter cooperii al 
Northern goshawk Accipiter gentilis 5 
Red-tailed hawk Buteo jamaicensis 8 
American kestrel Falco sparverius 12 
Blue grouse Dendragapus obscurus 2 
Ruffed grouse Bonasa umbellus 84 
Wild turkey Meleagris gallopavo 3 
California quail Callipepla californica dL 
Spotted sandpiper Actitis macularia 1 
Common snipe Gallinago gallinago 3 
Mourning dove Zenaida macroura 9 
Barn owl Tyto alba 1 
Common poorwill Phalaenoptilus nuttallii 1 
Vaux’s swift Chaetura vauxi 2 
White-throated swift Aeronautes saxatalis al 
Calliope hummingbird Stellula calliope 9 
Broad-tailed hummingbird Selasphorus platycercus 2 
Rufous hummingbird Selasphorus rufus 59 
Belted kingfisher Ceryle alcyon 7 
Lewis’s woodpecker Melanerpes lewis 4 
Williamson’s sapsucker Sphyrapicus thyroideus 8 
Red-naped sapsucker Sphyrapicus nuchalis 122 
Downy woodpecker Picoides pubescens T 
Hairy woodpecker Picoides villosus 63 
Three-toed woodpecker Picoides tridactylus 8 
Black-backed woodpecker Picoides arcticus al 
Northern flicker Colaptes auratus 1412 
Pileated woodpecker Dryocopus pileatus 42 
Olive-sided flycatcher Contopus cooperi 56 
Western wood-pewee Contopus sordidulus alal 
Willow flycatcher Empidonax traillii 34 
Hammond's flycatcher Empidonax hammondii 302 
Dusky flycatcher Empidonax oberholseri 247 
Cordilleran flycatcher Empidonax occidentalis 23 
Violet-green swallow Tachycineta thalassina 1 
Barn swallow Hirundo rustica Al 
Gray jay Perisoreus canadensis 94 
Steller's jay Cyanocitta stelleri 67 
Clark's nutcracker Nucifraga columbiana 5 
American crow Corvus brachyrhynchos dl 
Common raven Corvus corax Alls) 
Black-capped chickadee Poecile atricapilla aL ial 


(continues) 
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Appendix. (Continued) 


No. of No. of 

detection detection 
Common name Scientific name sites Common name Scientific name sites 
Mountain chickadee Poecile gambeli 165 American redstart Setophaga ruticilla 10 
Boreal chickadee Poecile hudsonica dL Northern waterthrush Seiurus noveboracensis f 
Chestnut-backed chickadee Poecile rufescens 385 MacGillivray’s warbler Oporornis tolmiei 719 
Red-breasted nuthatch Sitta canadensis 635 Common yellowthroat Geothlypis trichas 8 
White-breasted nuthatch Sitta carolinensis 40 Wilson’s warbler Wilsonia pusilla 158 
Pygmy nuthatch Sitta pygmaea 4 Western tanager Piranga ludoviciana 444 
Brown creeper Certhia americana 74 Black-headed grosbeak Pheucticus 
Rock wren Salpinctes obsoletus dl melanocephalus 128 
Canyon wren Catherpes mexicanus al Lazuli bunting Passerina amoena 104 
House wren Troglodytes aedon 102 Spotted towhee Pipilo maculatus 74 
Winter wren Troglodytes troglodytes 335 Chipping sparrow Spizella passerina 273 
American dipper Cinclus mexicanus 14 Savannah sparrow Passerculus 
Golden-crowned kinglet Regulus satrapa 740 sandwichensis 3 
Ruby-crowned kinglet Regulus calendula 120 Fox sparrow Passerella iliaca 147 
Western bluebird Sialia mexicana 1 Song sparrow Melospiza melodia 156 
Mountain bluebird Sialia currucoides 13 Lincoln’s sparrow Melospiza lincolnii i 
Townsend's solitaire Myadestes townsendi VS White-crowned sparrow Zonotrichia leucophrys 8 
Veery Catharus fuscescens al Dark-eyed junco Junco hyemalis 899 
Swainson's thrush Catharus ustulatus 524 Red-winged blackbird Agelaius phoeniceus 1 
Hermit thrush Catharus guttatus 33 Western meadowlark Sturnella neglecta 9 
American robin Turdus migratorius 439 Brewer's blackbird Euphagus cyanocephalus 2 
Varied thrush Ixoreus naevius 187 Brown-headed cowbird Molothrus ater 120 
Gray catbird Dumetella carolinensis 4 Bullock's oriole Icterus bullockii al 
Cedar waxwing Bombycilla cedrorum 34 Pine grosbeak Pinicola enucleator 5 
European starling Sturnus vulgaris al Cassin’s finch Carpodacus cassinii 43 
Plumbeous vireo Vireo cassinii 327 Red crossbill Loxia curvirostra 45 
Warbling vireo Vireo gilvus SeT White-winged crossbill Loxia leucoptera 4 
Red-eyed vireo Vireo olivaceus Ih Pine siskin Carduelis pinus 202 
Orange-crowned warbler Vermivora celata 127 American goldfinch Carduelis tristis 4 
Nashville warbler Vermivora ruficapilla 100 Evening grosbeak Coccothraustes 
Yellow warbler Dendroica petechia 129 vespertinus 59 
Yellow-rumped warbler Dendroica coronata 678 House sparrow Passer domesticus 2 


Townsend's warbler Dendroica townsendi 
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Spatial Analysis of Stopover Habitats of 
Neotropical Migrant Birds 


Scott M. Pearson and Theodore R. Simons 


Recent declines in populations of Neotropical 
landbird migrants (Robbins et al. 1989b; Askins 
et al. 1990; Finch 1991) have prompted a wave of 
new research into the factors affecting populations of 
these birds on their breeding and wintering grounds 
(Hagan and Johnston 1992; Finch and Stangel 1993). 
Although breeding and wintering activities are essen- 
tial for population persistence, migration represents a 
significant event in the yearly life cycle of these birds 
because it requires a large energetic investment and 
can represent a period of high mortality. Moreover, 
areas used by migrating birds for rest and foraging 
(i.e., stopover areas) are experiencing rapid changes 
due to increasing urban, residential, and industrial/ 
agricultural development. Increases in the abundance 
of these habitats may be detrimental to migrants 
(Yong et al. 1998). Some studies have focused on the 
factors affecting birds during migration (Moore and 
Simons 1992; Winker et al. 1992; Watts and Mabey 
1993; Morris et al. 1994; Simons et al. 2000; Moore 
et al. 1995) and are shedding light on how stopover 
areas can be critical for many migratory species. De- 
signing conservation-oriented studies of the stopover 
ecology of migrants is complicated by the fact that mi- 
gration occurs over a broad geographic area, but over 
a relatively short time period. It is often difficult for 
field researchers to be in the right place at the right 
time. 


Remote-sensing technology and spatial-modeling 
techniques are providing new research tools for inves- 
tigating how the distribution and abundance of habi- 
tats may affect wildlife populations. Metrics of land- 
scape pattern provide a means to quantify differences 
between landscapes. Landscape indices and spatial 
models (e.g., habitat suitability, individual-based mod- 
els) are tools that allow biologists to assess how differ- 
ences in the abundance and spatial arrangement of 
habitats may affect the suitability of landscapes for se- 
lected species. Moreover, these models can be modi- 
fied to assess the quality of landscapes for species with 
different habitat needs, physiological requirements, or 
foraging strategies (Dunning et al. 1995; Turner et al. 
1995). Conclusions gleaned from these models, how- 
ever, need to be interpreted within the context of the 
model design and assumptions. 

We have recently discussed how spatial models can 
be applied to questions about the stopover ecology of 
trans-Gulf migrants (Simons et al. 2000). We have 
shown how models incorporating available data on 
the arrival condition of migrants, energetic and mor- 
phological constraints on movement, and species- 
specific habitat preferences can provide insights into 
how the abundance, quality, and spatial pattern of 
habitats interact with the arrival energetic state of mi- 
grants to determine the suitability of migratory 
stopover habitats along the northern Gulf Coast. Our 
goal in this chapter is to explore further how an 
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analysis of landscape composition and spatial models 
that distinguish between habitat specialists and gener- 
alists can improve our understanding of the factors 
constraining migrants at stopover sites. We hope that 
the results of this analysis will provide insights that 
are useful in setting priorities for future research and 
conservation. 

This research will compare five study landscapes lo- 
cated on the northern coast of the Gulf of Mexico 
using (1) landscape-level metrics of spatial pattern, 
and (2) output from an individual-based model of 
stopover habitat use. Landscape-level metrics provide 
a means to quantify the abundance and spatial pattern 
of habitat types in study landscapes (Turner and Gard- 
ner 1991). The most straightforward measure is the 
area of suitable habitat types. The spatial arrangement 
of habitats can also be measured. For example, habitat 
fragmentation is often quantified using a combination 
of the number of habitat patches, mean patch size, 
and area of the largest patch. The juxtaposition of 
habitat types can be measured with indices of edge 
density or contagion (Hargis et al. 1997). Landscape- 
level metrics provide objective measures of habitat 
patterns; however, interpreting these metrics from the 
perspective of a species’ biology remains challenging. 
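The fragmentation measures described above (number of patches, mean patch size, and largest patch) can be sketched for a small raster. This is an illustrative sketch only: the toy grid is hypothetical, and treating diagonal neighbors as part of the same patch is an assumption rather than a setting reported in this chapter.

```python
from collections import deque

def patch_sizes(grid, habitat, eight_neighbor=True):
    """Return the size, in cells, of each contiguous patch of `habitat`."""
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if eight_neighbor:  # assumption: diagonal cells belong to the same patch
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != habitat or seen[r][c]:
                continue
            # Breadth-first flood fill over one patch.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == habitat):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes

# Hypothetical 4 x 4 map of habitat categories.
toy = [
    [4, 4, 1, 1],
    [4, 1, 1, 4],
    [1, 1, 2, 4],
    [4, 1, 2, 2],
]
sizes = patch_sizes(toy, habitat=4)
n_patches = len(sizes)
mean_patch = sum(sizes) / n_patches
largest = max(sizes)
print(n_patches, mean_patch, largest)
```

Multiplying a patch size in cells by the area of one cell converts it to hectares.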

Field studies have shown that the arrival energetic 
state of migrants at trans-Gulf stopover sites is highly 
variable (Moore et al. 1990; Moore and Simons 1992; 
Morris et al. 1994; Moore et al. 1995; Fransson and 
Jakobsson 1998). Energy reserves can constrain long- 
and short-range movements of migrants (Jenni and 
Jenni-Eiermann 1998). Measurements of fat reserves 
of birds arriving along the northern Gulf Coast have 
been used with flight performance models to estimate 
the potential flight ranges of migrants at coastal 
stopover sites (Simons et al. 2000). Flight ranges may 
be as little as a few kilometers in extreme cases and 
average from tens to several hundred kilometers in 
most species. Thus, the distribution, abundance, and 
quality of suitable habitat within the range of mi- 
grants at coastal stopover sites can be viewed as an 
important constraint on the likelihood of a successful 
migration. Habitat preferences have been demon- 
strated for migrants during stopover (e.g., Weisbrod et 
al. 1993; Russell et al. 1994). Along the Gulf Coast, 
field studies have shown that migrants appear to pre- 
fer forests with a well-developed understory and ripar- 
ian bottomlands over other habitats available at 
stopover sites (Moore et al. 1990; Simons et al. 2000). 
These observations suggest habitat specialization may 
also constrain the suitability of stopover sites for some 
species. 

We have developed an individual-based model that 
simulates habitat usage and energy gain by birds dur- 
ing migration stopover events. The model is simplistic 
and makes minimal assumptions about the details of 
habitat use. It is designed to provide information 
about the consequences of foraging in landscapes that 
vary with respect to the abundance, spatial arrange- 
ment, and quality of habitats that differ with respect 
to their foraging returns. Changing the model parame- 
ters also allows us to simulate birds that vary in their 
energy states and habitat specificity (e.g., generalists 
versus specialists). 

The specific objectives of this work were (1) to in- 
vestigate the relative importance of landscape pattern 
of habitats, energetic state of arriving birds, and habi- 
tat specialization for successful stopover, and (2) to 
identify the types of landscapes that will provide suit- 
able stopover habitat. This second objective was im- 
plemented by ranking a set of real landscapes accord- 
ing to their suitability for stopover habitat use. To 
achieve these objectives, we analyzed data from re- 
mote sensing using modeling and analytical ap- 
proaches based on an understanding of stopover ecol- 
ogy gained from field studies. 


Study Area and Methods 


The study landscapes were five 25 × 25-kilometer re- 
gions located along the northern Gulf Coast in the 
states of Texas, Louisiana, and Mississippi (see Fig. 
52.1 in color section). Habitat data for these land- 
scapes were derived from a supervised classification of 
two 1990 Landsat Thematic Mapper images. The 
classification was performed by the U.S. Geological 
Survey’s Southern Science Center in Lafayette, 
Louisiana. The original map consisted of eighteen 
cover types in raster format having square 28.5 × 28.5- 
meter cells. These eighteen habitat types were reclassi- 
fied into four habitat categories that represent four 
classes of habitat quality (Table 52.1). Category 1 rep- 
resented the poorest-quality habitat class. These habi- 
tats provide few if any opportunities for foraging by 
migrating landbirds. Category 4 represents the high- 
est-quality habitats that provide abundant food re- 
sources. Assignment of habitats to these classes was 
based on field studies of habitat usage and habitat-spe- 
cific weight gain conducted during 1987-1994 
(Moore et al. 1990; Kuenzi et al. 1991; Simons et al. 
2000). 


TABLE 52.1. 


Habitat categories used for landscape suitability analyses. 


                Category 1(a)     Category 2        Category 3               Category 4(b) 

                Unclassified      Emergent marsh    Mixed shrub-scrub        Deciduous forest 
                Water             Residential       Evergreen shrub-scrub    Bottomland forest 
                Excavated soil    Pine forest       Mixed forest 
                Beach/sand        Cropland 
                Sand bar          Orchards 
                Commercial 
                Transportation 
                Industrial 

% Total(c)      [illegible]       53.6              31.4                     [illegible] 


(a) Category 1 habitats offer few opportunities for foraging; therefore, this category represents the poorest habitats for migrants. 
(b) Category 4 habitats were assumed to be the best habitats for migrants. 
(c) The percent representation of each of the categories, summed over all landscapes, is reported. 


Landscape Metrics 


For each study landscape, the spatial pattern of the 
four habitat categories was quantified using 
FRAGSTATS (McGarigal and Marks 1995). For each 
habitat category, the following metrics were recorded: 
percentage of total area of map occupied by each 
habitat category, total number of patches of all habitat 
types, patch density (number of patches per hectare), 
mean patch size (hectare), edge density (meters of edge 
per hectare), Simpson’s index, and contagion. Propor- 
tion of total area serves as a measure of the abundance of 
the habitat category. Collectively, the number of 
patches, patch density, and mean patch size provide a 
means to compare the relative fragmentation or con- 
nectivity of a given habitat type among the five study 
landscapes. Landscapes in which habitats are more 
fragmented have a greater number of patches, higher 


patch density, and smaller mean patch size. Edge den- 
sity and contagion measure the degree of interspersion 
and contact between habitat types. Edge density will 
increase in landscapes in which patch sizes are small 
and patches have complex or elongated shapes (e.g., 
skinny rectangles or dendritic shapes) rather than 
compact shapes (e.g., circles or squares). Contagion 
also measures the degree of habitat fragmentation and 
contact between habitat types. The contagion value 
represents the probability that two adjacent cells, cho- 
sen at random, will be of the same habitat type. Thus, 
contagion will be greater for landscapes in which 
habitats are highly clumped and lower for maps in 
which habitats are fragmented and highly inter- 
spersed. See McGarigal and Marks (1995) for a more 
complete description of the calculation and interpreta- 
tion of these landscape metrics. 
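The difference between composition and arrangement can be illustrated with two toy maps. The sketch below is hedged: it computes Simpson's index in its common diversity form (one minus the sum of squared proportions) and the simple probability that two adjacent cells share a habitat type, as described in the text. The contagion statistic reported by FRAGSTATS is calculated differently, so this adjacency share is only a stand-in, and both grids are hypothetical.

```python
def simpson_index(proportions):
    """Simpson's diversity: 1 minus the sum of squared proportions."""
    return 1.0 - sum(p * p for p in proportions)

def like_adjacency(grid):
    """Share of side-by-side cell pairs that have the same habitat type."""
    rows, cols = len(grid), len(grid[0])
    same = total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # pair with the cell to the east
                total += 1
                same += grid[r][c] == grid[r][c + 1]
            if r + 1 < rows:  # pair with the cell to the south
                total += 1
                same += grid[r][c] == grid[r + 1][c]
    return same / total

# Two hypothetical maps with identical composition (half type 1, half type 4).
clumped = [[1, 1, 4, 4]] * 4
interspersed = [[1, 4, 1, 4], [4, 1, 4, 1], [1, 4, 1, 4], [4, 1, 4, 1]]

print(simpson_index([0.5, 0.5]))     # same for both maps
print(like_adjacency(clumped))       # high: habitats are clumped
print(like_adjacency(interspersed))  # zero: habitats are fully interspersed
```

With four categories, Simpson's index in this form cannot exceed 0.75, which is reached when all categories are equally abundant.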

In addition to the metrics provided by 
FRAGSTATS, we calculated an index of general land- 
scape quality using the following formula: 


Quality = P_1 + 2P_2 + 3P_3 + 4P_4 


where P_x represents the proportion of the total land- 
scape area occupied by habitat category x. Landscapes 
with a greater proportion of high-quality habitat types 
will receive a greater-quality score. Although this 
index will permit the ranking of landscapes with 
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respect to the abundance of habitat categories, it does 
not indicate anything about the spatial arrangement of 
habitats within the landscape. This index uses an 
econometric approach similar to that of a habitat suit- 
ability index (HSI). In HSI techniques, each habitat 
unit (usually an areal unit) is multiplied by an ordinal 
index of habitat quality. 
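As a worked example, the quality index can be computed as a weighted sum in which each habitat category's proportion is multiplied by its category number. This is a minimal sketch; the function name and the example proportions are hypothetical and are not taken from the study landscapes.

```python
def quality_index(proportions):
    """Weighted sum of area proportions, weighted by habitat category (1-4).

    `proportions` maps category number to the proportion of landscape area.
    """
    total = sum(proportions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("proportions must sum to 1")
    return sum(category * p for category, p in proportions.items())

# Hypothetical landscape: 10% category 1, 50% category 2,
# 30% category 3, and 10% category 4.
hypothetical = {1: 0.10, 2: 0.50, 3: 0.30, 4: 0.10}
score = quality_index(hypothetical)  # 0.1 + 1.0 + 0.9 + 0.4
print(round(score, 2))
```

The index ranges from 1 (all category 1 habitat) to 4 (all category 4 habitat), and two landscapes with the same proportions receive the same score regardless of how their habitats are arranged.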


Individual-based Model 


This model uses an energy state index (ESI) to indi- 
cate the relative energetic state of birds during migra- 
tory stopover. We assume that birds arrive at the 
stopover site with low energy reserves. During 
stopover, an individual bird forages to improve its 
energetic state, rebuilding energy reserves that will 
fuel its next migratory flight (e.g., Alerstam and 
Lindstrom 1990; Fransson 1998). The species that 
inspired this study typically migrate by flying long 
distances (usually hundreds of kilometers per flight) 
at night. At the end of one of these long-distance 
flights, the birds "stop over" and spend one or more 
days foraging to refuel before the next long-distance 
flight. The availability of habitats that provide forag- 
ing opportunities at the stopover site will determine 
the rate at which energy reserves can be rebuilt. Birds 
that land in relatively rich sites will be able to refuel 
quickly; those that land in poorer sites will take 
longer to store enough energy to make another long- 
distance, nocturnal flight (Yong et al. 1998). In a 
worst case, a migrant in a very poor site may starve 
because it cannot ingest enough energy to satisfy its 
immediate energetic needs. 

The model incorporates these assumptions about 
energetic state in the following way. Arriving birds 
with a given ESI land in a randomly determined cell in 
the habitat map. By changing the initial ESI value, we 
simulated birds that have varying levels of energy 
upon arriving at the stopover site. The bird then for- 
aged by moving from cell to cell. At each cell, ESI is 
updated using this equation: 


ESI_i = (energy gained in cell_i) − (energy cost of movement and foraging). 


Foraging cost was held constant, and foraging gain 
accrued by the birds as they moved across the land- 
scape depended on the habitat category of each cell 


TABLE 52.2. 


Energetic gain values by habitat categories for habitat 
generalists and specialists. The habitat categories are defined 
in Table 52.1. 


                    Habitat category 

                    1       2       3       4 
Generalist          0.10    0.55    0.65    0.75 
Specialist          0.00    0.10    0.45    1.50 


encountered. In productive habitats, migrants experi- 
enced a net energy gain (i.e., gain is greater than cost). 
In the poorest habitats, there was a net loss (gain is 
less than cost). 

The degree of ESI gain or loss depended on whether 
the bird was classified as a habitat generalist or a habi- 
tat specialist (Table 52.2). For both generalist and spe- 
cialist, the cost of foraging and moving was fixed at 
0.50 ESI units for all habitat types. The habitat- 
specific gain values were adjusted so that both general- 
ists and specialists would receive the same gain if they 
encountered equal proportions of the four habitat cat- 
egories. However, the specialist did better than the 
generalist on category 4 cells and worse on category 2 
and 3 cells. 
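The balance between gain and cost can be checked with a few lines of arithmetic. In the sketch below, the specialist gains and the fixed cost of 0.50 come from the text and Table 52.2. The generalist gains for categories 2 and 3 (0.55 and 0.65) are reconstructed from a damaged table and should be read as assumptions; they are the values that make the two strategies break even when the four categories are encountered in equal proportions.

```python
FORAGING_COST = 0.50  # ESI units per cell, the same for every habitat type

# Gain per cell by habitat category. Generalist values for categories
# 2 and 3 are reconstructed (an assumption), not read directly.
GAIN = {
    "generalist": {1: 0.10, 2: 0.55, 3: 0.65, 4: 0.75},
    "specialist": {1: 0.00, 2: 0.10, 3: 0.45, 4: 1.50},
}

def net_change(strategy, category):
    """Net ESI change for foraging in one cell of the given category."""
    return GAIN[strategy][category] - FORAGING_COST

# Mean gain when all four categories are encountered equally often.
mean_gain = {s: sum(g.values()) / len(g) for s, g in GAIN.items()}

for strategy in GAIN:
    nets = [round(net_change(strategy, c), 2) for c in (1, 2, 3, 4)]
    print(strategy, nets, round(mean_gain[strategy], 4))
```

Under these values a generalist loses energy only in category 1 cells, whereas a specialist loses energy everywhere except category 4, where it gains a full ESI unit per cell.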

In the model, the bird continued to forage until its 
ESI crossed one of two thresholds. If the individual 
gained enough energy, it left the study landscape on 
another long-range migratory movement. In contrast, 
individuals that failed to find productive habitat con- 
tinually lost energy and died (if the ESI dropped too 
low). When an individual migrated or died, the num- 
ber of cells visited was recorded in the model output. 
For these simulations, the migration threshold was 
fixed at an ESI of 30.0. The death threshold was set at 
an ESI of 2.0. Thresholds were the same for both 
habitat generalists and specialists. 
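The stopping rule can be sketched as a loop over the cells a bird visits. This is an illustration under stated assumptions, not the authors' program: it assumes that the net change in each cell is added to a running ESI and that reaching a threshold exactly counts as crossing it. The function name and the example sequences are hypothetical.

```python
def stopover(arrival_esi, net_changes, migrate_at=30.0, die_at=2.0):
    """Accumulate net ESI changes until a threshold is crossed.

    Returns the outcome and the number of cells visited.
    """
    esi = arrival_esi
    for visited, delta in enumerate(net_changes, start=1):
        esi += delta
        if esi >= migrate_at:  # enough reserves to resume migration
            return "migrated", visited
        if esi <= die_at:      # reserves exhausted
            return "died", visited
    return "undecided", len(net_changes)

# A bird arriving with ESI 20 that gains 1.0 per cell (for example, a
# specialist in category 4 habitat with a cost of 0.50).
rich = stopover(20.0, [1.0] * 50)

# A bird arriving with ESI 5 that loses 0.4 per cell (for example, a
# generalist confined to category 1 habitat).
poor = stopover(5.0, [-0.4] * 50)

print(rich, poor)
```

The first bird leaves after ten cells, while the second falls below the death threshold after eight, which shows how arrival condition and habitat quality jointly determine the outcome.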

Movement from cell to cell was governed by a subroutine 
that incorporated knowledge of adjacent cells, the 
ability to choose among cells based on habitat 
quality, and a northerly bias to movement. When 
moving from cell to cell, the individual was assumed 
aware of the habitat types of the adjacent eight cells. 
The bird would choose cells with higher ESI gain val- 
ues (i.e., higher habitat-quality category) over cells 
with lower ESI gain values. Laboratory studies have 
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TABLE 52.3. 


Coefficients used to incorporate a northerly bias to bird 
movement during stopover. 


                North 
        0.90    1.00          0.90 
West    0.75    Focal cell    0.75    East 
        0.60    0.50          0.60 
                South 


demonstrated a tendency for birds migrating through 
the northern Gulf in the spring to orient and move 
northward (Gauthreaux 1971; Emlen 1975). There- 
fore, we assumed that given two cells of equal habitat 
quality, a bird is more likely to move to the more 
northerly cell. This bias was accomplished by dis- 
counting the gain value for each cell by its position 
relative to north. Thus, the bird would calculate an at- 
tractiveness value for each cell: 


Attractiveness = (ESI gain)* (nbias) 


where nbias is this discounting coefficient. Table 52.3 
shows the nbias coefficients used in these simulations. 
Birds would move to the cell with the greatest attrac- 
tiveness value. If two cells were equal in greatest at- 
tractiveness, choice between them would be made ran- 
domly. Birds were not allowed to return to cells that 
were previously visited. Arriving birds were randomly 
located in the southern portion of the map to a cell 
within 1.0 kilometer of the Gulf Coast. Data for birds 
that wandered to the edge of the map without migrat- 
ing or dying were discarded, and the simulation was 
reinitiated for that individual. 
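The movement rule described above can be sketched as a single function that scores the eight neighboring cells and picks the most attractive one. Several details are assumptions made for illustration: row 0 of the grid is taken to be north, the east and west coefficients (0.75) are reconstructed from a damaged table, and the function name, grids, and gain values passed in are hypothetical inputs rather than the authors' code.

```python
import random

# Northerly bias coefficients by (row offset, column offset); row -1 is north.
# The east and west values of 0.75 are reconstructed (an assumption).
NBIAS = {
    (-1, -1): 0.90, (-1, 0): 1.00, (-1, 1): 0.90,
    (0, -1): 0.75,                 (0, 1): 0.75,
    (1, -1): 0.60,  (1, 0): 0.50,  (1, 1): 0.60,
}

def choose_next_cell(grid, gain, position, visited, rng=random):
    """Pick the unvisited neighbor with the greatest attractiveness.

    Attractiveness = (ESI gain of the cell's habitat) * (nbias).
    Ties are broken at random; returns None if no neighbor is available.
    """
    row, col = position
    best, best_score = [], None
    for (dr, dc), nbias in NBIAS.items():
        r, c = row + dr, col + dc
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            continue  # off the map
        if (r, c) in visited:
            continue  # birds do not return to visited cells
        score = gain[grid[r][c]] * nbias
        if best_score is None or score > best_score:
            best, best_score = [(r, c)], score
        elif score == best_score:
            best.append((r, c))
    return rng.choice(best) if best else None

generalist_gain = {1: 0.10, 2: 0.55, 3: 0.65, 4: 0.75}
uniform = [[3, 3, 3], [3, 3, 3], [3, 3, 3]]
rich_south = [[1, 1, 1], [1, 3, 1], [1, 4, 1]]

print(choose_next_cell(uniform, generalist_gain, (1, 1), set()))
print(choose_next_cell(rich_south, generalist_gain, (1, 1), set()))
```

In the uniform map the bird moves due north, while in the second map the high-quality cell to the south outweighs the northerly bias (0.75 × 0.50 exceeds 0.10 × 1.00).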


Simulation Experiment 


A set of simulations was designed to assess the per- 
formance of birds using the five study landscapes as 
potential stopover sites. The goal of this experiment 
was to determine the relative importance of landscape 
pattern of habitats, energetic state of arriving 
birds, and species’ habitat specialization to stopover 
performance. Performance was measured by (1) the 
proportion of birds that survived and migrated, and 
(2) the number of cells visited by migrating birds. In 
highly suitable landscapes, it is expected that a greater 
proportion of birds will migrate and that fewer cells 
will be visited during the stopover time because energy 
gain per cell will be greater on average. However, the 
actual performance of the birds could be affected by 
the patchiness and interspersion of habitat types. 

A factorial design was used with the following lev- 
els: (1) five study landscapes, (2) two levels of habitat 
specialization, and (3) four levels of arriving ESI. The 
levels of arrival ESI were set at 5, 10, 15, and 20. For 
each of forty treatment combinations, five hundred 
replicate birds were simulated. The relative influences 
of landscape, arrival ESI, and habitat specialization 
were compared by examining the magnitude of F 
scores from the analyses of variance. ANOVAs were 
conducted using SAS (1985). To improve normality, 
the proportions of migrants were arcsine-square-root 
transformed and counts of cells were square-root 
transformed before conducting the ANOVAs (Sokal 
and Rohlf 1995). 
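The factorial layout and the two transformations can be sketched in a few lines. This is only an illustration of the design described above, written in Python rather than SAS; the variable names and labels are hypothetical.

```python
import itertools
import math

landscapes = ["A", "B", "C", "D", "E"]
strategies = ["generalist", "specialist"]
arrival_esi = [5, 10, 15, 20]
replicates = 500

# Every combination of landscape, specialization, and arrival ESI.
treatments = list(itertools.product(landscapes, strategies, arrival_esi))
total_birds = len(treatments) * replicates

def arcsine_sqrt(p):
    """Arcsine-square-root transform of a proportion between 0 and 1."""
    return math.asin(math.sqrt(p))

def sqrt_count(n):
    """Square-root transform of a count of cells."""
    return math.sqrt(n)

print(len(treatments), total_birds)
print(round(arcsine_sqrt(0.5), 4), round(sqrt_count(100), 1))
```

The design yields forty treatment combinations and 20,000 simulated birds in total.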


Results 


Differences among the five study landscapes evident in 
Figure 52.1 were also revealed in the landscape met- 
rics, although variation in the selected metrics was not 
striking. Comparing the relative abundance of each 
habitat type is the simplest way to compare the five 
landscapes. Landscape A had the greatest amount of 
the highest-quality habitat, category 4 (Fig. 52.2). 


[Figure 52.2. Bar chart "Abundance of Habitats": percent of map (y-axis) by study landscape (x-axis), with bars for habitat categories 1 through 4.] 

Figure 52.2. The relative abundance of four habitat types in 
each of the five study landscapes. 
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TABLE 52.4. 


Landscape metrics for five study landscapes. 


                                       Landscape 

Metric(a)                 A         B              C              D              E 

Number of patches         32,426    [illegible]    31,059         29,491         31,463 
Patch density (no./ha)    51.9      50.2           49.7           47.2           50.4 
Mean patch size (ha)      1.93      [illegible]    [illegible]    [illegible]    1.97 
Edge density (m/ha)       196.7     194.7          199.3          188.7          192.0 
Simpson’s index           0.69      0.71           0.68           0.73           0.60 
Contagion (%)             34.8      34.0           33.9           31.8           40.9 
Quality index(b)          2.57      [illegible]    [illegible]    2.47           2.70 


(a) See McGarigal and Marks (1995) for a complete description of the first six metrics. 
(b) The quality index is described in the methods section of this chapter. 


Landscape E had the greatest amount of habitat cate- 
gory 3 and the greatest amount of habitats 3 and 4 
combined. There was little difference in the abun- 
dance of high-quality habitat among landscapes B, C, 
and D. Among those three study areas, landscape C 
had the least amount of poor-quality, category 1 habi- 
tat. Landscape D had the greatest value of Simpson’s 
index (Table 52.4), indicating that the habitat cate- 
gories were most evenly distributed in this map com- 
pared to the other four maps. Landscape E had the 
lowest value of this index due to the greater relative 
abundance of category 3 habitats. 

The remaining metrics provide measures of the spa- 
tial pattern of the four habitats and their intersper- 
sion. The number of patches, patch density, and mean 
patch size provide a means to compare the relative lev- 
els of fragmentation among the five maps. Landscape 
A had the greatest number of patches, largest patch 
density, and smallest mean patch size (Table 52.4), 
which indicates that this map had the greatest degree 
of habitat fragmentation. In contrast, landscape D had 
the lowest number of patches, least patch density, and 
largest mean patch size, indicating that habitats in this 
landscape tended to be more clumped in larger 
patches (landscapes A and D, Fig. 52.1). 

The interspersion of habitats may be important for 
birds as they move across the landscapes. Although in- 
terspersion is related to measures of habitat fragmen- 
tation, the metrics of edge density and contagion pro- 
vide information on the likelihood that a moving bird 
will encounter different habitat types during stopover. 


Landscape C had the greatest edge density (Table 
52.4) due to the high interspersion of habitats and the 
complex shapes of habitat patches evident in the northern 
portion of this map (Fig. 52.1). This landscape had an 
intermediate level of contagion (Table 52.4) due to the 
differences between the northern and southern sec- 
tions of this map. Landscapes B, C, and D show a 
sharp gradient in the pattern of category 4 habitats. 
These habitats, which include deciduous and bottom- 
land forests, become more abundant in wetland areas 
that are farther away from the brackish water influ- 
ence of the Gulf of Mexico. Landscape E had the high- 
est level of contagion and an intermediate value of 
edge density (Table 52.4) caused by the increased cov- 
erage of category 3 habitats in this map (Fig. 52.2). 
This habitat dominates much of the middle portions 
of this landscape (Fig. 52.1). 


Probability of Successful Stopover 


Highly suitable landscapes would be expected to 
have a high proportion of birds that acquire enough 
energy to migrate after stopover. The relative influ- 
ences of landscape, arrival ESI, and habitat specializa- 
tion were compared by examining the magnitude of 
F scores from the analysis of variance. Habitat spe- 
cialization had the greatest effect on the probability of 
successful migration (Table 52.5). Habitat generalists 
had consistently higher probabilities of successful 
stopover than did specialists (Fig. 52.3). The success 
of specialists varied among landscapes (Fig. 52.3), 
resulting in a weaker, although significant landscape 




TABLE 52.5. 


Analysis of variance of the proportion of birds surviving to 
migrate. 


Source                       df    Type III SS    F              P 
Landscape                    4     0.458          [illegible]    <0.001 
Arrival ESI                  1     0.200          60.0           <0.001 
Habitat specialization       1     0.659          198.2          <0.001 
Landscp. × ESI               4     0.012          0.9            0.457 
Landscp. × Specialization    4     0.679          51.1           <0.001 
ESI × Specialization         1     0.001          0.4            0.550 
Total                        39    6.686 


Note: Proportions were arcsine-square root transformed before 
analysis. 


main effect and landscape x specialization interac- 
tion. Among the main effects, arrival ESI had an in- 
fluence of intermediate magnitude. Higher values of 
arrival ESI resulted in a greater chance of successful 
stopover, especially for habitat specialists (Fig. 52.3). 
There were no significant interactions involving ar- 
rival ESI. 


Number of Cells Visited by Migrants 


Among the main effects, the strongest influences were 
arrival ESI and landscape pattern (Table 52.6). In- 
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Figure 52.3. The proportion of simulated birds obtaining 
enough energy to survive and migrate out of the study land- 
scape. Results for the five landscapes are plotted in separate 
groups of bars. Two types of birds were simulated: habitat 
specialists and habitat generalists. Each bird began its 
stopover habitat use with one of four levels of energy reserves. 
These reserves were tracked in the individual-based model 
using an energy state index (ESI). 


creasing arrival ESI tended to reduce the number of 
cells visited by successful migrants (Fig. 52.4). Land- 
scapes A and E (mean ± SD, respectively: 96.0 ± 74.9, 
103 ± 778.2) required fewer cells for successful migra- 
tion than landscapes B, C, and D (136.7 ± 144.4, 
136.6 ± 110.1, 159.3 ± 146.0, respectively). The effect 
of habitat specialization was of intermediate magni- 
tude. However, there were strong interactions between 
this factor, landscape, and arrival ESI. Arrival ESI had 
a stronger influence on generalists than on specialists 
(Fig. 52.4). The number of cells visited was consis- 
tently reduced for generalists with higher levels of en- 
ergy on arrival; the influence of this factor on special- 
ists was less consistent among the landscapes. 
Specialists visited more cells than generalists did in 
four of the five landscapes. Specialists visited fewer 
cells than generalists did in landscape A. The higher 
efficiency of specialists in this landscape was likely due 
to the greater abundance and dispersion of category 4 
habitats in this map. 


Ranking Study Landscapes 


The landscape metrics and output from the individual- 
based model allowed the ranking of landscapes. The 
most straightforward means was to use the quality 
index (Table 52.4) that is similar to HSI approaches. 
Although this measure is not spatially explicit, it pro- 
vided an initial intuitive assessment of the relative 
abundance of high-quality habitats among the alterna- 
tive landscapes. Based on the quality index, land- 
scapes E and A ranked highest; landscapes C and B 


TABLE 52.6. 


Analysis of variance of the number of cells visited by 
successful migrants. 


Source                       df        Type III SS    F        P 
Landscape                    4         [illegible]    37.4     <0.001 
Arrival ESI                  1         11,544.55      985.6    <0.001 
Habitat specialization       1         [illegible]    107.6    <0.001 
Landscp. × ESI               4         885.0          18.9     <0.001 
Landscp. × Specialization    4         20,044.1       427.8    <0.001 
ESI × Specialization         1         4,766.2        406.9    <0.001 
Total                        14,679    234,342.3 


Note: Cell counts were square root transformed before analysis. 
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300 + Landscape A Landscape B — Landscape C Landscape D Landscape E 


Cells visited by migrants 
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Figure 52.4. The number of cells visited by birds that survived 
and migrated from study landscape. See Figure 52.3 caption 
for explanation of format. 
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ranked the worst (Table 52.7). The high rank of land- 
scape E was produced by its greater relative abun- 
dance of category 3 habitats. 

Output from the individual-based model provided 
information on the relative success rates of migrant 
birds using these landscapes. The proportion of indi- 
viduals surviving to migrate and the number of cells 
visited by successful migrants served as measures of 
success and were used to rank the landscapes. High 
proportions of successful migrants and low numbers 
of cells visited were taken to represent the best land- 
scapes. Based on proportion of migrants, landscapes E 
and A ranked best; landscape C ranked worst (Table 
52.7). Landscapes B and D had similar proportions of 


TABLE 52.7.

Ranking of landscapes based on quality index, proportion of
successful migrants, and number of cells visited by
successful migrants.

                          Landscape designationᵃ
             ---------------------------------------------------
             Quality        Proportion         Cells
Rank         indexᵇ         migrants           visited
1 (best)     highest E      most E (0.980)     least A (95.0)
2            A              A (0.854)          E (97[?])
3            D              B (0.662)          C (136.6)
4            C              D (0.638)          B (136.7)
5 (worst)    lowest B       least C (0.535)    most D (159.3)

ᵃMean proportion of successful migrants and mean number of cells
visited are reported in parentheses after the landscape designation
in the respective columns.

ᵇThe quality index is reported in Table 52.4.


migrants and were intermediate in rank. Based on the 
number of cells visited, landscapes A and E again 
ranked best; landscape D was worst. Landscapes C 
and B had almost identical numbers of cells visited 
and had an intermediate rank. 

In summary, the precise ranking of landscapes de- 
pended on the method employed. Although land- 
scapes A and E consistently ranked as the best, the 
ranking of remaining landscapes varied with method. 
For example, landscape D ranks as third best with re- 
spect to the quality index but fourth or worst accord- 
ing to the individual-based model. 


Discussion 


Transcontinental migration is a brief period in the annual
cycle of a Neotropical migrant bird (i.e., a couple of
weeks during the spring and autumn). However, successful
completion of migration between wintering
and breeding grounds is essential to an individual's 
evolutionary fitness and for the persistence of the pop- 
ulation as a whole. Successful migration depends on 
the existence of suitable stopover habitat along migra- 
tion routes. The availability and quality of this habitat 
are as crucial as the quality of wintering and breeding 
sites. Although stopover sites are used only during a 
brief period of the year, the existence of high-quality 
sites is necessary for the continued persistence of these 
avian species. 


Conservation of Stopover Sites: 
A Challenging Problem 


The conservation strategy that led to the present sys- 
tem of waterfowl reserves along the Gulf Coast may 
not work for migrant songbirds. There are fifteen na- 
tional wildlife refuges and two national seashores that 
protect coastal habitat along the northern Gulf Coast. 
These areas were established primarily for the purposes 
of waterfowl conservation and recreation. In large 
part, the remaining coastal cheniers and riparian 
woodlands that are important to trans-Gulf migrants 
are unprotected and not in the public trust. The water- 
fowl refuges were selected because specific locations 
used by these birds could be identified. Providing habi- 
tat for migrating songbirds is not as straightforward. 
The geographic extent and stochastic nature of
stopover events present a challenge for
maintaining populations of migrating songbirds.
For example, on the Gulf Coast the exact location of 
“fall out” events is determined by onshore weather en- 
countered when the birds reach the coast and the 
weather patterns experienced during their flight over 
the Gulf of Mexico. Groups of migrant birds may land 
just about anywhere in a given landscape used for 
stopover during the course of several seasons. The 
conservation manager is then faced with the difficulty 
of devising ways to protect a set of species that are de- 
pendent on a particular landscape but whose use of 
the landscape is geographically random and seasonally 
ephemeral. Moreover, the manager has little or no 
control over a vast portion of the landscape that is 
owned and used by a large number of private 
landowners. These challenges to protection efforts 
may seem insurmountable. Nevertheless, we are opti- 
mistic that feasible strategies for preserving an essen- 
tial amount and geographic distribution of stopover 
habitat can be developed if scientists develop a better 
understanding of stopover ecology. 

The same characteristics of stopover events that 
discourage managers (i.e., geographically extensive, 
randomly located, temporally ephemeral) also con- 
found research scientists seeking to conduct field stud- 
ies. Some of these difficulties may be overcome by the 
increasing availability and accuracy of remotely 
sensed data covering large areas. Maps derived from 
these data allow scientists to examine the relative 
abundance and spatial pattern of habitats over broad 
geographic areas. By combining these maps with field 
studies, researchers should develop a better under- 
standing of the characteristics of landscapes that pro- 
vide quality stopover sites. 


Field Studies at Stopover Sites and 
Landscape-level Analyses 


The analytical and modeling approaches used in this 
study were developed from the findings of field studies.
Although limited in geographic and temporal extent,
field studies provide information about the ecology
of individual birds and small groups of birds using
stopover habitat. These studies provide information 
on the important features of habitat use such as 
coarse-grained and microhabitat preferences, duration 


of stay at a given site, patterns of movement within a 
site, energetic condition of arriving birds, and qualita- 
tive comparison of relative rates of energy gain among 
alternative sites. The vegetation types and microhabi- 
tats available at a given study site may be quantified, 
but researchers must rely on accurate maps of sur- 
rounding areas to learn about the abundance of habi- 
tats in the landscape surrounding the site. Habitat 
maps permit the study of research questions related to 
the influence of the surrounding landscape on the use 
of a given stopover site (e.g., Pearson 1993). For ex- 
ample, does the use of a given 5-hectare study plot of 
high-quality habitat depend on the abundance of that 
habitat in the surrounding landscape? 

Other questions about landscape-level habitat use
are more difficult (but not impossible) to address with 
field studies. For example, what is the scale of habitat 
selection conducted by arriving birds? Once the bird 
lands, what aspects of the landscape affect its pattern 
of movement among habitat patches? The geographic 
extent of migration makes it difficult to design com- 
prehensive field studies. However, we should be able 
to explore these questions using landscape analyses 
and spatial models that are based on our understand- 
ing of stopover ecology gained from past and present 
field studies. These explorations should lead to predic- 
tions and hypotheses that could be tested in future, 
carefully designed field studies. The approach taken in 
our study was to use a model to rank landscapes ac- 
cording to the “success” of migrants during stopover. 
The measures of success included the duration of stay 
(i.e., number of cells visited by migrants). 

By using the model, we gained insights into how al- 
ternative landscape configurations may affect this as- 
pect of stopover ecology, and we generated predictions 
about the duration of stay that can be tested in future 
field studies. Analyzing the predictions of a general 
model can help identify aspects of stopover ecology 
that need addressing in future field studies. For the 
conservation manager, a general model could be used 
to evaluate management alternatives. For example, the 
performance of migrants could be simulated on alter- 
native landscape patterns produced by different policy 
options that affect land use on private lands (e.g., conservation
easements) or by the creation of new protected
areas under public ownership. Given the limited
resources for conservation, an approach like this can
help managers select the strategies that achieve the
greatest positive change in the landscape.
Thus, landscape analyses and modeling
studies provide a powerful tool to complement field 
studies. 


Results from This Study 


Habitat specialization emerged as an important influ- 
ence on the probability of successful stopover and the 
amount of time needed to rebuild energy reserves (Ta- 
bles 52.5 and 52.6). Our results indicate that the ef- 
fects of habitat pattern and abundance on migrants 
will be amplified for habitat specialists and when birds 
arrive at stopover sites with their energy reserves de- 
pleted. The difference in the relative abundance of 
habitats in these landscapes is not striking (Fig. 52.2). 
Although some differences in spatial arrangement of 
habitats are apparent in Figure 52.1, it is not clear 
from a cursory inspection of these maps which land- 
scape should be better or worse. A more thorough 
analysis was needed. While landscape pattern ranked 
the lowest among main effects in both ANOVAs, these 
analyses revealed that there were strong interactions 
among landscape pattern, habitat specialization, and 
arrival ESI for number of cells visited (Table 52.6). 
These results suggest that the differences in the abun- 
dance and spatial arrangement of habitats matter 
more for species with specialized habitat needs and 
less for generalists. Admittedly, this is an intuitive result,
but the degree of difference that habitat specialization
makes, and the relative ranking of these landscapes,
could not have been determined without the use
of the model.

Ranking landscapes depends on the method for 
quantifying habitat suitability. Table 52.7 shows that 
the exact ranking depends on the schema being used. 
Landscape C ranks third, fourth, or fifth, depending 
on which ranking method is employed. The differ- 
ences in the ranking of this landscape based on pro- 
portion of successful migrants versus number of cells 
visited are due to differences in performance between
habitat generalists and specialists. Although the per- 
formance of generalists on this landscape was compa- 
rable to landscapes B and C, the performance of spe- 


cialists was much worse. Specialists had the lowest 
success rates on landscape C (Fig. 52.3) and required a 
higher number of cells visited (Fig. 52.4) than on any 
of the other landscapes. Choices about the most desir- 
able criteria for ranking landscapes must be made by 
knowledgeable managers within the context of specific 
conservation goals. Spatial models and landscape 
analyses can provide information useful for compar- 
ing alternative criteria. 

The results caused us to reevaluate the effects of 
some landscape features. A priori, we expected land- 
scape D to rank high because of the presence of two 
large river corridors dominated by high-quality decid- 
uous bottomland forest. However, this landscape 
ranked in the middle or lower half of the rankings 
(Table 52.7). The presence of large patches of bottom- 
land forest was counteracted by the abundance of 
large patches of low-quality habitat in the southern 
portion of the map. The southern region provides 
none of the category 4 habitats that provide high re- 
turns for the habitat specialists. Migrants landing in 
the southern half had to contend with these relatively 
poor habitats before encountering the richer sites to 
the north. 

The model output also assisted in interpreting the 
landscape metrics, such as contagion. While the relative
abundance of habitats affects the mean habitat
quality, patch size and interspersion of habitats affect
the variance in foraging returns experienced by a bird 
as it moves across the landscape. Higher levels of in- 
terspersion (i.e., low contagion) reduce the variance in 
foraging returns for a bird visiting a fixed number of 
cells. Thus, in landscapes with the same levels of inter- 
spersion, mean habitat quality (calculated at a scale 
relevant to a single bird during stopover) is most im- 
portant. In a landscape with the same mean quality 
but lower levels of interspersion (i.e., high contagion), 
the variance among birds will be higher because some 
will encounter large patches of high-quality habitat 
while others encounter patches of low-quality habitat. 

Changes in interspersion can be good or bad de- 
pending on mean habitat quality. If the average habi- 
tat quality is great enough for most birds to be suc- 
cessful, then increasing interspersion will require birds 
to visit a larger area (i.e., more cells), because they will 
inevitably encounter low-quality sites, although almost
all are assured to gather enough energy to migrate.
Decreasing interspersion (increasing contagion)
would mean that some birds would encounter large 
patches of rich and poor habitats. More birds would 
die without migrating because they landed in large 
patches of low-quality habitat. In contrast, decreasing 
interspersion could be a good thing for landscapes in 
which the mean habitat quality is so low that the aver- 
age foraging return is less than that needed for suc- 
cessful migration. If interspersion is high, practically 
no birds will be able to obtain enough energy during 
stopover because they will be receiving the mean re- 
turn for foraging on this landscape. Alternatively, if 
rich habitats are more clumped, then at least some 
birds will land in these areas and become successful, 
although most will perish because they landed in poor 
areas. This is a spatially explicit example of issues ad- 
dressed by the topic of risk-sensitive foraging (e.g., 
Caraco et al. 1980; Stephens and Charnov 1982) investigated
in the field of optimal foraging theory.
These issues have been addressed in other systems
where prey are cryptic and have heterogeneous spatial
distributions by using a combination of field data and
simulation modeling (e.g., ungulates; Turner et al.
1994; Pearson et al. 1995).

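The interplay of interspersion, variance, and mean habitat quality described above can be checked with a toy calculation. The sketch below is not the individual-based model used in this study. It assumes a hypothetical one-dimensional ring of cells in which half the cells are rich (return 1) and half are poor (return 0), a bird that forages in a fixed number of consecutive cells from its landing point, and arbitrary energy thresholds for successful migration. All function names and values are illustrative assumptions.

```python
from statistics import mean, pvariance


def make_ring(n_cells, patch_len):
    """Ring of cells alternating rich (1) and poor (0) patches of equal length.

    patch_len = 1 gives maximum interspersion (low contagion);
    a large patch_len gives clumped habitat (high contagion).
    """
    return [1 if (i // patch_len) % 2 == 0 else 0 for i in range(n_cells)]


def foraging_returns(ring, cells_visited):
    """Total return for a bird landing on each possible cell in turn and
    foraging in `cells_visited` consecutive cells (wrapping around the ring)."""
    n = len(ring)
    return [
        sum(ring[(start + step) % n] for step in range(cells_visited))
        for start in range(n)
    ]


def success_rate(returns, threshold):
    """Fraction of birds whose return meets a hypothetical energy threshold."""
    return sum(r >= threshold for r in returns) / len(returns)


# Same mean quality (half the cells are rich), different spatial arrangement.
interspersed = foraging_returns(make_ring(1000, 1), cells_visited=10)
clumped = foraging_returns(make_ring(1000, 50), cells_visited=10)

for name, returns in (("interspersed", interspersed), ("clumped", clumped)):
    print(
        name,
        "mean:", mean(returns),
        "variance:", round(pvariance(returns), 2),
        "success if 4 needed:", success_rate(returns, 4),
        "success if 7 needed:", success_rate(returns, 7),
    )
```

With these assumed values both rings give the same mean return (5), but the variance is 0 on the interspersed ring and about 21.7 on the clumped ring. When the threshold (4) lies below the mean, every bird succeeds on the interspersed ring and only 53 percent succeed on the clumped ring; when the threshold (7) lies above the mean, no bird succeeds on the interspersed ring while 47 percent succeed on the clumped ring.
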
This understanding helps interpret differences in 
the performance of migrants on different landscapes. 
Landscape E consistently ranked high because it has 
the highest abundance of category 3 and 4 habitats 
(Table 52.4). Moreover, this map has the highest level 
of contagion (Table 52.4), driven by the extensive 
well-connected regions of category 3 habitat. The 
number of successful migrants is highest on this map 
(Table 52.7, Fig. 52.3). Habitat generalists do better 
than specialists do in this landscape because of their 
higher foraging returns in category 3 habitats (Table 
52.2, Fig. 52.4). In contrast, compare landscapes A, B, 
and C. These three landscapes have a similar level of 
contagion (Table 52.4). However, the greater abun- 
dance of high-quality habitats in landscape A in- 
creases the average quality of cells encountered during 
stopover. This difference results in a greater chance of 
successful stopover (Fig. 52.3) and fewer cells being 
visited by successful birds (Fig. 52.4) than in land- 


scapes B and C. Whereas successful habitat generalists 
visit fewer cells than specialists in landscapes B and C 
(Fig. 52.4), the greater abundance of category 4 habitats
in landscape A (Fig. 52.2) allowed specialists to
visit fewer cells than generalists in this landscape.
Specialists also had a higher probability of successful
migration in landscape A relative to B and C (Fig. 
52.3). Thus, this simulation model provided a means 
to evaluate the relative quality of landscapes from the 
perspective of birds that have different habitat needs. 
It also provides a mechanistic understanding of 
stopover habitat use that enhances our ability to inter- 
pret metrics of habitat pattern. 

The strengths and weaknesses of a given model 
should be known by the users. The main value of the 
simulation model used in this study is to illustrate the 
complex interactions that shape the process of song- 
bird migration. The factors shaping that process in- 
clude the pattern, abundance, and quality of stopover 
habitats, and the mobility and foraging ecology of in- 
dividual migrants. The individual-based model pro- 
vides a means to compare landscapes from a less-an- 
thropocentric perspective. The specific weaknesses of 
this model include the fact that (1) we have little 
knowledge about how variations in habitat quality at 
stopover sites translate into different rates of energy 
gain for migrants even though data on relative abun- 
dance of migrants, residency times, and fat condition 
in different habitats are available; (2) we have few
data on movement patterns of migrants at
stopover sites (but see Aborn and Moore 1997); and 
(3) we know little about the settling patterns of mi- 
grants at migratory stopover sites. 


Important Considerations in the Use of 
Metrics and Models 


At present, approaches to measuring landscape char- 
acteristics include (1) metrics of landscape patterns, 
and (2) implementation of spatial models. Both of 
these methods allow the researcher or manager to 
rank a series of real or hypothetical landscapes based 
on the abundance and spatial arrangement of habi- 
tats. Ranking landscapes with respect to their suit- 
ability for a given population or suite of species can 
be challenging. Obviously, the ranking will depend 
on the method for quantifying landscape suitability.
Researchers and managers should ideally use meth- 
ods that are both realistic and appropriate for the 
species of interest. Landscape metrics are useful if 
they can be readily interpreted—that is, if variation 
in a metric can be directly related to an important as- 
pect of the species’ biology. Some progress has been 
made in this area, but better links between metrics 
and specific ecological processes need to be forged. 
Approaches such as the habitat suitability index 
move a step beyond the use of landscape metrics be- 
cause these indices can incorporate the positive and 
negative influences of diverse habitat measurements 
from a variety of scales. Spatial models can be more 
realistic because they can incorporate more-compli- 
cated aspects of the species’ habitat use, such as the 
consequences of movement and habitat use on sur- 
vival and reproduction. 

Although spatial models can be realistic, tradeoffs 
will exist between the general use of a model and the 
number of assumptions it makes about the species’ 
ecology. The design of any complex model will in- 
volve making assumptions about habitat selection or 
other aspects of habitat use. These assumptions 
should be testable, or models should be based on our 
best understanding of species biology. Models involv- 
ing many precise parameters and detailed mecha- 
nisms (e.g., fine-grained habitat selection, movement 
between cells) are fraught with many more assump- 
tions than models that have fewer details and are 
more general in design. 

Conservation programs often raise questions about 
the ecology of species, reserve design, or other man- 
agement issues for which definitive data are not avail- 
able and/or are difficult to collect. Spatial analyses and 
modeling are tools that can be useful in making the 
best of these difficult situations, as long as the model 
results are viewed within a context of model design, 
assumptions, parameter values, and spatial (or other) 
data. Models and analyses based on the current best 
understanding of a species’ ecology can make it possi- 
ble to pose meaningful “what if” questions about 
management alternatives or ecological processes. 
These explorations often generate a better understand- 


ing of the system in question, and they can produce 
hypotheses that can be tested with empirical data from 
field studies. 


Summary 


Stopover habitat use presents challenges to research 
and management due to its broad spatial extent and 
seasonal, ephemeral time span. A landscape-level ap- 
proach is essential. Understanding gained from field 
studies can guide landscape-level analyses that in turn 
can be used to develop testable hypotheses for future 
research or to inform difficult management decisions. 
This study produced findings relevant to management. 
Landscapes with the greatest amount of high-quality 
habitat were most suitable, but the spatial arrange- 
ment of habitats modified suitability. For example, the 
fragmentation of habitats as measured by intersper- 
sion can be good or bad depending on mean habitat 
quality experienced by migrants. Decrease intersper- 
sion if mean quality is too low. Spatial arrangement 
becomes more important as landscape-wide average 
habitat quality declines. Moreover, landscape suitabil- 
ity depends on habitat specialization; the relative per- 
formance of generalists and specialists depends on the 
details of the relative abundance and spatial arrange- 
ments of habitats. Given the broad spatial extent of 
bird migration, policies that favor the protection of 
high-quality habitats throughout the landscape, in- 
cluding private lands, would be more beneficial than 
the purchase and management of a few preserves of 
limited spatial extent. 
Effects of Niche Width on the Performance
and Agreement of Avian Habitat Models


Jeffrey A. Hepinstall, William B. Krohn, and Steven A. Sader 


Conservation assessment and planning require
knowledge about the regional occurrence of 
species as well as information about trends in species 
abundance (Morrison et al. 1992). Many methods 
exist to correlate species presence with their environ- 
ment, to derive habitat associations, and to use these 
associations to build predictive models of vertebrate 
species occurrence. Two general groups of habitat 
models are statistical models (e.g., Hepinstall and 
Sader 1997; Tucker et al. 1997; Schulte and Niemi 
1998; Dettmers et al., Chapter 54; Vernier et al., 
Chapter 50) and those models derived from matrices 
documenting species-habitat associations (Salwasser et 
al. 1980; Scott et al. 1993; Boone and Krohn 2000a,b). 
Because habitat association models and statistical 
models differ in their basic premises, one would ex- 
pect predictions of species presence to differ for mod- 
els using the different methods. Species-habitat associ- 
ation matrices are derived from the literature and 
from expert review and, therefore, presumably repre- 
sent the range of habitats used, at least in well-studied 
species. Species models derived from these habitat as- 
sociations, therefore, predict potential rather than ac- 
tual habitat for a species. Statistical models are de- 
rived from mathematical relationships between species 
data and field data and, therefore, represent at best the 
current habitat of a species. Given the differences in 
the input data used to build habitat association matrix 


models and statistical models, statistical methods 
would be expected, in general, to underpredict species 
presence with respect to the more general habitat asso- 
ciation models. 

One factor that may affect the results of compar- 
isons of model predictions derived from statistical 
models or habitat association models is breadth of 
niche (here used in the Grinnellian [Grinnell 1917] 
sense of niche as the range of environmental attributes 
enabling individuals to survive and reproduce) that a 
species can occupy. Species that are generalists (eury- 
topes) and use many different habitats could be pre- 
dicted to occur everywhere by habitat association 
methods. Species with narrower niches are more likely 
to be accurately predicted by both methods. However, 
habitat specialists (stenotopes) will be modeled well 
by either method only if the required habitat is 
mapped and correctly delineated. 

Testing model output is important in determining 
model performance and accuracy, but test approaches 
are limited by what they are compared against (Krohn 
1992; Fielding and Bell 1997). For example, model 
predictions for some gap analysis projects have been 
tested against species lists from national parks and na- 
tional wildlife refuges (Scott et al. 1993; Edwards et 
al. 1996; Krohn et al. 1998). Species lists may contain 
records of species that no longer occur at the site or 
may not record species that have recently moved into 
an area. Confusion matrices can be used to calculate a 
variety of measures of agreement between observed
species occurrences and predicted species occurrences 
(Fielding and Bell 1997). Errors of omission (not pre- 
dicting a species that was present) and commission 
(predicting a species that was not present) typically are 
calculated, as is correct classification rate (correctly 
predicted species presence and species absence). Un- 
derstanding the ecological context of model errors is 
essential to understanding model performance (Field- 
ing and Bell 1997). For example, when identifying 
areas of conservation concern for reserve creation, 
commission errors may be more detrimental than 
omission errors, because overpredicting many species 
may lead to incorrect assessments of which areas are 
of higher species richness. In contrast, reducing omis- 
sion errors may be important when predicting the dis- 
tribution of an endangered species. 
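The agreement measures named above can be sketched in a few lines. This is a minimal illustration, not the procedure used in this chapter: the function name is hypothetical, and the error rates are expressed relative to the observed class (omission as a share of observed presences, commission as a share of observed absences), which is only one of several conventions in use.

```python
def agreement_measures(observed, predicted):
    """Summarize agreement between observed and predicted presence (True/False).

    Rates are computed relative to the observed class; this choice of
    denominator is an assumption of the sketch.
    """
    pairs = list(zip(observed, predicted))
    true_pos = sum(o and p for o, p in pairs)
    false_neg = sum(o and not p for o, p in pairs)    # omission errors
    false_pos = sum(p and not o for o, p in pairs)    # commission errors
    true_neg = sum(not o and not p for o, p in pairs)
    return {
        "omission_rate": false_neg / (true_pos + false_neg),
        "commission_rate": false_pos / (false_pos + true_neg),
        "correct_classification_rate": (true_pos + true_neg) / len(pairs),
    }


# Made-up example: 10 sites, species observed at the first 4.
observed = [True] * 4 + [False] * 6
predicted = [True, True, True, False, True, True, False, False, False, False]
print(agreement_measures(observed, predicted))
```

In this made-up example the model misses one of four observed presences (omission rate 0.25), predicts presence at two of six sites where the species was absent (commission rate about 0.33), and is correct at seven of ten sites. The function assumes at least one observed presence and one observed absence; otherwise a rate is undefined.
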

Interpreting the results of accuracy assessments re- 
quires an understanding of the biases inherent in the 
modeling process. Predictions based on habitat associa- 
tion models use general vegetation types as surrogates 
for a species’ habitat and thus assume that required mi- 
crohabitat elements will be present at least at some lo- 
cations; as a result, commission errors of species pres- 
ence will be higher than omission errors (Krohn 1996). 
Higher commission errors have been observed in sev- 
eral studies (Scott et al. 1993; Edwards et al. 1996; 
Block et al. 1994; Krohn et al. 1998; Karl et al., Chap- 
ter 51). However, if the objective of a study is to predict 
potential use of areas versus currently used areas, these 
commission errors are not necessarily serious errors 
(Edwards et al. 1996; Fielding and Bell 1997). 

The objectives of this study were to (1) compare 
spatially explicit predictions of species occurrence for 
land bird species in Maine derived from two modeling 
methods (one statistical and the other based on 
species-habitat association matrices), (2) test for the 
effect of species niche width on agreement between 
species predictions for each method, and (3) test for 
the effect of species niche width on model accuracy. 


Modeling Paradigms and 
Species Selection 


The selection of species modeled in this study was 
based on availability of survey data at a sufficient 


grain to build statistical models (n = 60) and validate 
model results (n = 20). Sufficient data were available 
to build and test models for twenty-eight species 
(Table 53.1). To rate species niche width, species were 
ranked by the number of vegetation and land-cover 
types used by each species according to the habitat association
models used in the Maine Gap Analysis
Project (ME-GAP; Boone and Krohn 1998a,b). 


Maine Gap Analysis 


ME-GAP (Krohn et al. 1998) used a thirty-seven-class 
vegetation and land-cover map (Hepinstall et al. 
1999) along with ancillary geographic information 
system (GIS) data to delineate habitat for each species. 
Bird species range limits were modified from DeGraaf 
and Rudis (1986), with modifications from several 
published sources (Adamus 1987; Erskine 1992; Foss 
1994; Gauthier and Aubry 1996), and expert review 
(Krohn et al. 1998). Species-habitat association matri- 
ces (heuristic measures) began with DeGraaf and 
Rudis (1986) and were modified with other published 
data and expert review (Krohn et al. 1998). The asso- 
ciation matrices formed look-up tables to recode the 
vegetation and land-cover types into used and unused 
types (Boone and Krohn 1998b). 
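The look-up-table recoding described above can be illustrated with a small sketch. The class codes, species labels, and associations below are hypothetical placeholders, not entries from the ME-GAP matrices; the second function shows how a niche-width score of the kind reported in Table 53.1 (percentage of types used) could be derived from the same table.

```python
# Hypothetical land-cover class codes; the actual ME-GAP map has 37 classes.
DECIDUOUS, CONIFEROUS, WETLAND, URBAN = 1, 2, 3, 4

# Hypothetical association matrix: species label -> set of classes it uses.
ASSOCIATIONS = {
    "SPECIALIST": {DECIDUOUS},
    "GENERALIST": {DECIDUOUS, CONIFEROUS, URBAN},
}


def recode_to_habitat(land_cover, species, associations=ASSOCIATIONS):
    """Recode a grid of land-cover classes into used (1) and unused (0) cells."""
    used = associations[species]
    return [[1 if cell in used else 0 for cell in row] for row in land_cover]


def percent_types_used(species, n_types, associations=ASSOCIATIONS):
    """Niche-width score: percentage of land-cover types used by a species."""
    return 100 * len(associations[species]) / n_types


land_cover = [
    [DECIDUOUS, DECIDUOUS, WETLAND],
    [CONIFEROUS, URBAN, WETLAND],
]
print(recode_to_habitat(land_cover, "SPECIALIST"))
print(recode_to_habitat(land_cover, "GENERALIST"))
print(percent_types_used("GENERALIST", n_types=4))
```

The same land-cover grid yields a much larger predicted habitat area for the generalist than for the specialist, which is why niche width is expected to influence agreement between modeling methods.
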

Normally, appropriate habitat polygons extending 
beyond a species’ range limit are included as predicted 
species presence (Scott et al. 1993). Because binary 
species predictions at the edges of a species’ range do 
not accurately reflect how a species occurs at range 
limits, a random feathering that eliminated an increas- 
ing proportion of the appropriate habitat toward the 
edge of a species’ range (Krohn 1996) was used to ap- 
proximate a theoretical range edge (Krohn et al. 1998; 
Boone and Krohn 1998b). Feathering was done 3 to 
50 kilometers from the edge of a species’ range, depending
on each species’ mobility (Boone and Krohn
1998c). Feathering of species predictions at the edge 
of their ranges will increase the disagreement between 
species predictions from each modeling method be- 
cause Bayesian predictions were made statewide. 
However, only four of the modeled species discussed 
in this chapter (Regulus satrapa, Parula americana, 
Sitta canadensis, and Sphyrapicus varius) had range 
limits in the state (all reach their southern limits in 




TABLE 53.1.

Species common and scientific names, species code, and the percentage of ME-GAP vegetation and
land-cover types used by each species (based on Boone and Krohn 1998b).

                                                            Species    % of
Common name                     Scientific name             code       types usedᵃ
American robin                  Turdus migratorius          AMRO       73
American crow                   Corvus brachyrhynchos       AMCR       70
Common yellowthroat             Geothlypis trichas          COYE       65
Song sparrow                    Melospiza melodia           SOSP       65
Eastern wood-pewee              Contopus virens             EWPE       59
White-throated sparrow          Zonotrichia albicollis      WTSP       54
Nashville warbler               Vermivora ruficapilla       NAWA       51
Black-capped chickadee          Poecile atricapilla         BCCH       49
American redstart               Setophaga ruticilla         AMRE       49
Chestnut-sided warbler          Dendroica pensylvanica      CSWA       46
Rose-breasted grosbeak          Pheucticus ludovicianus     RBGR       46
Blue jay                        Cyanocitta cristata         BLJA       46
Hermit thrush                   Catharus guttatus           HETH       46
Winter wren                     Troglodytes troglodytes     WIWR       46
Yellow-bellied sapsucker        Sphyrapicus varius          YBSA       46
Red-eyed vireo                  Vireo olivaceus             REVI       43
Purple finch                    Carpodacus purpureus        PUFI       41
Magnolia warbler                Dendroica magnolia          MAGW       41
Least flycatcher                Empidonax minimus           LEFL       38
Veery                           Catharus fuscescens         VEER       38
Northern parula                 Parula americana            NOPA       [illegible]
Yellow-rumped warbler           Dendroica coronata          YRWA       32
Black-throated green warbler    Dendroica virens            BTNW       32
Black-and-white warbler         Mniotilta varia             BAWW       30
Ovenbird                        Seiurus aurocapillus        OVEN       30
Red-breasted nuthatch           Sitta canadensis            RBNU       27
Blackburnian warbler            Dendroica fusca             BLBW       27
Golden-crowned kinglet          Regulus satrapa             GCKI       16

ᵃDefined as the percentage of vegetation and land-cover types (n = 37) used by a species in the habitat association
models used in Maine Gap Analysis Project species models.


extreme southern Maine), making concern over this 
potential source of error minimal. 
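The random feathering described above can be sketched as follows. The text states only that an increasing proportion of habitat was eliminated toward the range edge over a species-specific distance of 3 to 50 kilometers; the linear ramp, the function names, and the input format used here are assumptions made for illustration and should not be read as the ME-GAP implementation.

```python
import random


def retention_probability(distance_km, feather_width_km):
    """Probability of keeping a habitat cell, rising linearly from 0 at the
    range edge to 1 at `feather_width_km` inside the range (assumed ramp)."""
    if feather_width_km <= 0:
        raise ValueError("feather_width_km must be positive")
    return max(0.0, min(1.0, distance_km / feather_width_km))


def feather(cells, feather_width_km, seed=0):
    """Randomly drop predicted-habitat cells near the range edge.

    `cells` is a list of (cell_id, distance_to_range_edge_km) pairs for cells
    already predicted as habitat inside the range limit.
    """
    rng = random.Random(seed)
    return [
        cell_id
        for cell_id, distance_km in cells
        if rng.random() < retention_probability(distance_km, feather_width_km)
    ]


# Hypothetical cells at the edge, halfway through, and beyond a 50-km feather.
cells = (
    [(f"edge{i}", 0.0) for i in range(1000)]
    + [(f"mid{i}", 25.0) for i in range(1000)]
    + [(f"core{i}", 60.0) for i in range(1000)]
)
kept = feather(cells, feather_width_km=50)
print(len(kept))
```

Under this assumed ramp, cells at the range edge are always dropped, cells beyond the feathering distance are always kept, and roughly half of the cells midway through the feathering zone are retained.
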


Bayesian Methods 


Species associations with environmental variables 
were based on species records in 1990 Breeding 
Bird Survey (BBS) data (J. A. Hepinstall et al. unpub- 
lished data). In 1990, thirty-nine BBS routes were 
run in Maine. Data were gathered during fifty three- 
minute point counts at 0.8-kilometer intervals along 
39.4-kilometer road routes. All bird species seen or 


heard within 0.4 kilometer of stop locations were 
recorded. 

To predict species occurrences, J. A. Hepinstall et 
al. (unpublished data) used six explanatory data lay- 
ers. Three were derived directly from unclassified 
1991 Landsat Thematic Mapper (TM) imagery: band 4
(near-infrared), band 5 (mid-infrared), and a texture 
measure derived from the variance of normalized dif- 
ference vegetation index (NDVI) values within a 
210 × 210-meter window (forty-nine pixels). TM imagery
and variance texture data were stored as 8-bit
data (0-255). We grouped (binned) the data values by
sets of five values to reduce the total number of classes 
in a data layer. Data values for each pixel within 400 
meters of BBS stops were extracted from the binned 
data sets. The remaining three layers were derived 
from the 1993 ME-GAP vegetation and land-cover 
map (Hepinstall et al. 1999): classes within 400 me- 
ters of BBS stop locations; classes within 200 meters 
of BBS stop locations; and vegetation class richness 
within 400 meters of BBS stop locations. The vegeta- 
tion class richness was calculated as the number of 
classes in a 210x210-meter window. 
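The preprocessing steps above (grouping 8-bit values into sets of five and computing a variance texture over a seven-by-seven pixel window) can be illustrated with a short Python sketch. It is not the code used in the study. The function names are hypothetical, the NDVI formula is the standard (NIR - red) / (NIR + red) definition, and the handling of window cells that fall off the edge of the grid is an assumption made here for simplicity.

```python
def bin_value(value, width=5):
    """Map an 8-bit value (0-255) to a class index by grouping sets of `width` values."""
    if not 0 <= value <= 255:
        raise ValueError("expected an 8-bit value")
    return value // width


def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def window_variance(grid, row, col, size=7):
    """Population variance of the values in a size x size window centred on (row, col).

    Cells that fall outside the grid are skipped (an assumption of this sketch).
    """
    half = size // 2
    values = [
        grid[r][c]
        for r in range(row - half, row + half + 1)
        for c in range(col - half, col + half + 1)
        if 0 <= r < len(grid) and 0 <= c < len(grid[0])
    ]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)
```

With a bin width of five, the 256 possible 8-bit values collapse to 52 classes (indices 0 through 51).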

Bayes’ Theorem provides a method for combining 
frequencies of association (conditional probabilities) 
between species presence and values in each explana- 
tory data layer with a priori (subjective) probabilities 
of occurrence to estimate posterior probabilities of 
species presence. Conditional probabilities given 
species presence or absence were derived for each 
value in each data layer (J. A. Hepinstall et al. unpub- 
lished data). Data layers were recoded using mean es- 
timates (based on one hundred bootstrapped samples) 
of conditional probabilities that were determined to 
be significantly different for species presence and ab- 
sence through a contingency table analysis. Probabili- 
ties from each data-layer value were combined using 
Bayes’ Theorem to produce a probability of species 
presence using Equation 53.3 from Hepinstall and 
Sader (1997). Model output ranged from 0 to 1.0, 
with values greater than 0.5 indicating predicted 
species presence. 
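As an illustration of how conditional probabilities from several data layers can be combined with a prior, the sketch below applies Bayes' Theorem to a single pixel. It assumes the layers are conditionally independent given presence or absence, which may differ from the exact formulation in Hepinstall and Sader (1997); the function names and the example probabilities are hypothetical.

```python
def posterior_presence(prior, layer_likelihoods):
    """Combine per-layer conditional probabilities with a prior via Bayes' theorem.

    layer_likelihoods: list of (p_value_given_presence, p_value_given_absence)
    pairs, one per data layer, for the values observed at a single pixel.
    Layers are treated as conditionally independent (an assumption of this sketch).
    """
    present = prior
    absent = 1.0 - prior
    for p_given_presence, p_given_absence in layer_likelihoods:
        present *= p_given_presence
        absent *= p_given_absence
    total = present + absent
    if total == 0:
        raise ValueError("likelihoods are zero under both hypotheses")
    return present / total


def predicted_present(posterior, threshold=0.5):
    """Apply the 0.5 cutoff described in the text."""
    return posterior > threshold
```

For example, with a prior of 0.5 and one layer whose observed value is three times as frequent at presence sites as at absence sites (0.6 versus 0.2), the posterior is 0.75 and the pixel would be scored as predicted presence.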

There were forty-seven possible model permuta- 
tions for each species (six possible data layers, but 
with no models combining the two data layers of the 
ME-GAP vegetation and land-cover map). Model per- 
formance was evaluated through a number of tests 
using 1990 BBS data, change from a priori, and differ- 
ence from a random model. The best Bayesian model 
for each species was defined as the satisfactory model 
with the highest agreement with BBS verification data. 
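The count of forty-seven follows from simple enumeration: six layers give 63 non-empty subsets, and 16 of those contain both layers of the excluded pair. The sketch below reproduces the count. The layer labels are invented for illustration, and treating the two "classes within" layers (400 meters and 200 meters) as the excluded pair is an assumption; the arithmetic is the same whichever pair is excluded.

```python
from itertools import combinations

# Layer names are labels invented for this sketch.
LAYERS = ["tm_band4", "tm_band5", "ndvi_texture",
          "veg_classes_400m", "veg_classes_200m", "veg_richness_400m"]
# Assumed pair of layers that were never combined in one model.
EXCLUDED_PAIR = {"veg_classes_400m", "veg_classes_200m"}


def candidate_models(layers=LAYERS, excluded_pair=EXCLUDED_PAIR):
    """Return every non-empty layer subset that does not contain both excluded layers."""
    models = []
    for size in range(1, len(layers) + 1):
        for subset in combinations(layers, size):
            if not excluded_pair <= set(subset):
                models.append(subset)
    return models
```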


Between Model Comparisons 


The best Bayesian model for each species was com- 
pared with the prediction from ME-GAP. Because 
both methods produce spatially explicit measures of 


species presence for the state of Maine, a complete 
cross-tabulation of each model’s prediction was 
possible. The statewide correct classification rate 
(CCR; Eq. 53.1) was calculated for each species 
(Fielding and Bell 1997). 


$$\text{Correct Classification Rate} = \frac{a + d}{a + b + c + d} \tag{53.1}$$

where a, b, c, and d are taken from the agreement ma- 
trix in Figure 53.1. 

The McNemar test (Conover 1980) was used to 
test if ME-GAP “overpredicted” species occurrence 
for significantly more area than Bayesian predictions 
(b and c in Fig. 53.1) (Eq. 53.2). 


$$\text{McNemar's test statistic} = \frac{b - c}{\sqrt{b + c}} \tag{53.2}$$


where b and c are taken from the agreement matrix in 
Figure 53.1. The test statistic would be positive if 
more area was predicted as species presence in ME- 
GAP models and species absence in Bayesian models 
and would be negative if the opposite were true. A simplified measure, the
off-diagonal ratio (Eq. 53.3), which varies from 0 to 1 and equals 0.5 when
the errors are balanced between b and c, was used to evaluate the
directionality of model prediction errors.

$$\text{Off-diagonal Ratio} = \frac{b}{b + c} \tag{53.3}$$

where b and c are taken from the agreement matrix in Figure 53.1.

Figure 53.1. (1) Example agreement matrix comparing Maine Gap Analysis Project (ME-GAP) and Bayesian predictions and (2) confusion matrix comparing observed to predicted.
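The three between-model measures can be computed from the four cells of an agreement matrix, as in the following sketch. It assumes that b counts cells where ME-GAP predicts presence and the Bayesian model predicts absence, that c counts the reverse, and that the McNemar statistic takes the signed form (b - c) / sqrt(b + c); these choices are consistent with the sign conventions described in the text but are assumptions of the sketch, as is the function name.

```python
from math import sqrt


def agreement_measures(a, b, c, d):
    """Return CCR, signed McNemar statistic, and off-diagonal ratio.

    a: both models predict presence      b: ME-GAP presence, Bayesian absence
    c: ME-GAP absence, Bayesian presence d: both models predict absence
    """
    ccr = (a + d) / (a + b + c + d)
    if b + c == 0:
        return ccr, 0.0, 0.5   # no disagreement; 0.5 is a convention of this sketch
    mcnemar = (b - c) / sqrt(b + c)
    off_diagonal = b / (b + c)
    return ccr, mcnemar, off_diagonal
```

Because the McNemar statistic scales with the square root of the number of disagreeing cells, it grows very large for statewide rasters, whereas the off-diagonal ratio stays between 0 and 1 regardless of map size.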

Linear regression analysis was used to test for a sig- 
nificant relationship between (1) measures of agree- 
ment of ME-GAP and Bayesian species predictions 
(CCR and off-diagonal ratio) and (2) species niche 
width (percentage of vegetation and land-cover types 
used by a species from ME-GAP). 

Spatial autocorrelation, which is the tendency of 
nearer objects to be more similar (or more dissimilar) 
than expected by chance, may have affected the direct 
comparisons of model predictions from both methods. 
Measurements of agreement, such as the correct clas- 
sification rate, should be larger within smaller dis- 
tances from random points if positive spatial autocor- 
relation exists. Although we did not measure spatial 
autocorrelation directly, the comparison of CCR and 
the off-diagonal ratio at various distances from ran- 
dom points can be used to index the spatial juxtaposi- 
tion of model predictions. To determine if there were 
biases associated with comparing the species predic- 
tions for each method on a statewide, pixel-by-pixel 
basis, we calculated CCR and off-diagonal ratio for 
five hundred points randomly generated throughout 
the state. We buffered these five hundred points at dis- 
tances of 50, 200, and 400 meters. We calculated the 
model agreement between Bayesian predictions and 
ME-GAP predictions for the buffer strips of 0-50 me- 
ters, 51-200 meters, and 201-400 meters. If the meas- 
ured variables reached an asymptote at 51-200 me- 
ters, it would potentially indicate a spatial limit to the 
autocorrelation of agreement measures. 
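The buffer-strip comparison can be sketched as follows for two binary prediction grids. This is a brute-force illustration rather than the GIS procedure used in the study: the function name, the grid representation, and the default 30-meter cell size are assumptions, and a cell that lies near more than one random point is counted once per point.

```python
from math import hypot


def ccr_by_distance_band(map_a, map_b, points, bands, cell_size=30.0):
    """Correct classification rate between two binary grids within distance bands.

    map_a, map_b: equal-sized lists of rows holding 0/1 predictions.
    points: (row, col) cell indices of the random points.
    bands: (inner, outer] distance limits in metres, e.g. [(0, 50), (50, 200)].
    cell_size: cell width in metres (30 m is an assumption of this sketch).
    """
    agree = [0] * len(bands)
    total = [0] * len(bands)
    for p_row, p_col in points:
        for r, row in enumerate(map_a):
            for c, value_a in enumerate(row):
                dist = hypot(r - p_row, c - p_col) * cell_size
                for i, (inner, outer) in enumerate(bands):
                    in_band = dist <= outer and (dist > inner or inner == 0)
                    if in_band:
                        total[i] += 1
                        agree[i] += value_a == map_b[r][c]
    return [a / t if t else None for a, t in zip(agree, total)]
```

Calling the function with bands of (0, 50), (50, 200), and (200, 400) returns one agreement rate per strip, which can then be compared across strips for each species.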


Model Verification: 
Agreement with BBS Data 


We calculated the agreement between model predic- 
tions for each method and BBS 1990 stop data. Be- 
cause the BBS data were used to build the Bayesian 


models, measures comparing Bayesian predictions 
with species observed in the BBS record are only veri- 
fication of model formulation (see Conroy and 
Moore, Chapter 16) and not of model validation (ac- 
curacy of model at predicting species presence or ab- 
sence at new sites). 

We assessed agreement between predicted and ob- 
served species presence using two methods. For the 
first test, we scored a species as predicted to be present 
at a BBS stop location if any model output value 
within 50 meters of a BBS stop was greater than 0.5. 
Our rationale for querying only pixels within 50 me- 
ters of a survey point location was to limit random 
agreement between the model prediction and the BBS 
record. However, this measure would be positively bi- 
ased toward ubiquitous species. We calculated the 
number of stops where a species was observed in the 
1990 BBS data and was predicted to occur by each 
method. Because BBS survey data do not explicitly 
measure species absence on a site, measurements of 
agreement that incorporate b or d (Fig. 53.1) were in- 
appropriate (see discussion of commission error rates 
in Schaefer and Krohn, Chapter 36). Instead, we cal- 
culated the positive agreement of model predictions 
with BBS data according to Equation 53.4. 


$$\text{Positive Agreement} = \frac{a}{a + c} \tag{53.4}$$

where a and c are taken from the confusion matrix
in Figure 53.1. Positive agreement is defined as the
conditional probability that a model predicts species
presence at a site given that the species was observed there
(Fielding and Bell’s [1997] “sensitivity”). We also
calculated the CCR for each species using the data within
50 meters of BBS stops where the species was ob- 
served (Eq. 53.1). Because field data do not actually 
record all species present on a site (Boone and Krohn 
1999), positive agreement is a less-biased measure 
than CCR, which is likely to be negatively biased. 
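A minimal sketch of the first test follows. It scores each stop where the species was observed as predicted present if any model value within 50 meters exceeds 0.5, then reports a / (a + c). The function names and the input layout (one list of pixel values per stop) are assumptions made for illustration.

```python
def stop_predicted_present(model_values, threshold=0.5):
    """A stop counts as predicted present if any nearby model value exceeds the threshold."""
    return any(value > threshold for value in model_values)


def positive_agreement(values_at_observed_stops, threshold=0.5):
    """a / (a + c): share of stops with an observed species that are predicted present.

    values_at_observed_stops: one list of model output values (pixels within
    50 m) for each stop where the species was observed.
    """
    if not values_at_observed_stops:
        raise ValueError("no stops with the species observed")
    a = sum(stop_predicted_present(v, threshold) for v in values_at_observed_stops)
    return a / len(values_at_observed_stops)
```

Only stops with an observed species enter the calculation, so the measure does not depend on cells b or d of the confusion matrix.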

As a comparison of the above measures of positive 
agreement and CCR, we tested for a significant differ- 
ence in the mean proportion of predicted species oc- 
currence for the area within 200 meters of point loca- 
tions where a species was observed and where a 
species was not observed. We used a one-tailed t-test 
with unequal variances (Welch's approximation; Zar
1996:129) to test for differences. This test was less bi- 
ased toward ubiquitous species than the measures of 
positive agreement and CCR within 50 meters were. 
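For reference, the Welch statistic and its approximate degrees of freedom can be computed as in the sketch below. The function name is hypothetical, and the inputs stand for the proportions of predicted presence within 200 meters of points where a species was and was not observed. A one-tailed P value would still have to be read from a t distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_1, sample_2):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    n1, n2 = len(sample_1), len(sample_2)
    se1 = variance(sample_1) / n1   # squared standard error of each mean
    se2 = variance(sample_2) / n2
    t = (mean(sample_1) - mean(sample_2)) / sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df
```

A positive t indicates a higher mean proportion of predicted presence around points where the species was observed, which is the direction tested by the one-tailed test.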


Model Validation 


We also tested our model predictions against field data 
from Manomet Center for Conservation Sciences (J. 
Hagan personal communication) to validate model 
predictions (Conroy and Moore, Chapter 16). Ten- 
minute point counts (n = 387) were run twice during 
the breeding season (1992 and 1993) in west-central 
Maine (Hagan et al. 1997). We calculated the same 
measures of agreement between model output and the 
Manomet field data as we calculated for the BBS data 
(i.e., positive agreement and CCR within 50 meters 
and significant difference of means within 200 meters 
of survey point locations). 

Linear regression analysis was used to test for a sig- 
nificant relationship between (1) measures of agree- 
ment between model predictions and field data (posi- 
tive agreement and CCR) and (2) species niche width 
(percentage of vegetation and land-cover types used 
by a species from ME-GAP). 


Results 


Agreement between Methods 


Three of the top four eurytopic species (as ranked by 
their use of vegetation types in the ME-GAP species- 
habitat association matrices: Turdus migratorius, Ge- 
othlypis trichas, and Melospiza melodia) had low (less 
than 30 percent) CCR between the two prediction 
methods (Table 53.2). Only one other species, Regulus 
satrapa, had a CCR value of less than 50 percent. 
Only three species (Troglodytes troglodytes, Sphyrapi- 
cus varius, and Dendroica magnolia) had greater than 
70 percent CCR. These three species were modeled by 
ME-GAP to use 41-46 percent (ranked fourteenth,
fifteenth, and eighteenth of the twenty-eight species) of
the available vegetation and land-cover types (Table 
53.1). Maps of the Bayesian and ME-GAP predictions 
for three species (Corvus brachyrhynchos, Troglodytes
troglodytes, and Dendroica fusca) clearly show the 


TABLE 53.2. 


Correct classification rate (CCR), off-diagonal ratio, and 
McNemar’s test statistic calculated for statewide Bayesian 
predictions and Maine Gap Analysis predictions (Krohn et al. 
1998) for each of the twenty-eight species modeled. 


Species name               CCR^a    Off-diagonal ratio^b    McNemar's test statistic^c
Turdus migratorius         27.8     0.935                   2440
Corvus brachyrhynchos      51.2     0.946                   2024
Geothlypis trichas         28.4     0.955                   2495
Melospiza melodia          32.2     0.957                   2442
Contopus virens            52.3     0.862                   1625
Zonotrichia albicollis     61.6     0.895                   1580
Vermivora ruficapilla      54.8     0.863                   1578
Poecile atricapilla        58       0.838                   1401
Setophaga ruticilla        52.4     0.766                   1184
Dendroica pensylvanica     52.4     oser                    -728
Pheucticus ludovicianus    55.4     0.882                   1824
Cyanocitta cristata        55.3     0.811                   1342
Catharus guttatus          61.7     0.538                   153
Troglodytes troglodytes    72.0     0.710                   719
Sphyrapicus varius         71.6     0.810                   1068
Vireo olivaceus            63.4     0.697                   775
Carpodacus purpureus       53.6     0.727                   997
Dendroica magnolia         72.2     0.647                   499
Empidonax minimus          70.8     (0755                   200
Catharus fuscescens        65.4     0.643                   543
Parula americana           5        0.649                   628
Dendroica coronata         59.1     0.401                   -409
Dendroica virens           64.8     0.706                   794
Mniotilta varia            56.7     0.431                   -293
Seiurus aurocapillus       64.3     0.610                   428
Sitta canadensis           59.7     0.341                   -651
Dendroica fusca            58.9     0.653                   632
Regulus satrapa            48.2     0.864                   1691


^a Correct classification rate is the percentage of overall agreement between species predictions from Bayesian and Maine Gap Analysis models (joint predicted species presence and joint predicted absence); Equation 53.1.

^b Off-diagonal ratio is a measure of the directionality of disagreement between predictions from Bayesian and Maine Gap Analysis methods (Equation 53.3). Values above 0.5 indicate ME-GAP models overpredict species presence with respect to Bayesian models.

^c Conover (1980); Equation 53.2. Positive values indicate Maine Gap Analysis models overpredict species presence with respect to Bayesian models.
Figure 53.2. Predicted occurrences (black) of the American crow (Corvus brachyrhynchos) in Maine from the Bayesian method
and the Maine Gap Analysis Project (ME-GAP) method. This species uses 70 percent (second-widest niche of the twenty-eight 
species modeled) of the vegetation and land-cover types present in the ME-GAP map (Boone and Krohn 1998b). 


trend between model agreement and species niche 
width (Figs. 53.2, 53.3, and 53.4). 

Species niche width did have an effect on the agree- 
ment between model predictions from the two meth- 
ods. The values for CCR, off-diagonal ratio, and Mc- 
Nemar's test statistic were generally poorer for the 
more-generalist species than for the more-specialist 
species (Table 53.2). Correct classification rate
decreased significantly (P = 0.0139, r² = 0.21, slope =
-0.416) as the percentage of vegetation and land-cover
types used increased. The off-diagonal ratio
increased significantly (P = 0.001, r² = 0.34, slope =
0.0021) with increased use of vegetation and land-cover
types.
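The regressions of agreement measures on niche width are ordinary least-squares fits of one variable on another. The sketch below shows such a fit, together with the standard relationship between r squared, sample size, and the t statistic for the slope. The function names are hypothetical and no study data are embedded in the code.

```python
from math import sqrt


def linear_regression(x, y):
    """Ordinary least squares fit of y on x; returns slope, intercept, r squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def t_from_r_squared(r_squared, n):
    """t statistic (n - 2 degrees of freedom) for the slope of a simple regression."""
    return sqrt(r_squared * (n - 2) / (1.0 - r_squared))
```

With twenty-eight species and an r squared of 0.21, `t_from_r_squared(0.21, 28)` gives a t of about 2.63 on 26 degrees of freedom, which is in line with the two-tailed P value reported for CCR.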

The potential effects of spatial autocorrelation, as 
indexed by the CCR and off-diagonal ratio measured 
for five hundred random points at three distances (50, 
200, and 400 meters) may be limited to less than 400 
meters (Fig. 53.5). For fifteen of the twenty-eight 


species modeled, CCR rates were maximized at 200 
meters, potentially indicating a limit to the direct spa- 
tial coincidence of predictions for the two methods. 
Such patterns were not as clear with the measures of 
off-diagonal ratio for each buffer-strip distance. How- 
ever, the off-diagonal values tended to decrease for the 
more-specialized species (right side of graphs in Fig. 
53.5). Two species, Empidonax minimus and Regulus 
satrapa, had off-diagonal ratios below 0.5 for all three 
distances, but off-diagonal values above 0.5 when 
species predictions were compared statewide (Table 
53.2).


Model Verification and Validation 


Positive agreement of Bayesian model predictions with BBS data was, by
definition of a satisfactory model, equal to or greater than 70 percent
for all species (Table 53.3). ME-GAP predictions for two species
(Regulus satrapa and Dendroica pensylvanica) were the only
predictions with positive
Figure 53.3. Predicted occurrences (black) of the winter wren (Troglodytes troglodytes) in Maine from the Bayesian method
and the Maine Gap Analysis Project (ME-GAP) method. This species uses 46 percent (fourteenth-widest niche of the twenty- 
eight species modeled) of the vegetation and land-cover types present in the ME-GAP map (Boone and Krohn 1998b). 


agreement less than 70 percent with the BBS data. 
Positive agreement of ME-GAP predictions with 
Manomet data was less than 70 percent for two 
species (Mniotilta varia and Regulus satrapa). For 
seven species, generally the more stenotopic species of 
the species modeled, Bayesian model predictions had 
higher positive agreement with Manomet data than 
ME-GAP model predictions did (Table 53.3). 

The CCR was lower than the positive agreement 
for both modeling methods for all species except for 
ME-GAP predictions for Dendroica pensylvanica 
(BBS) and Regulus satrapa (BBS and Manomet) and 
Bayesian predictions for Turdus migratorius and Ge- 
othlypis trichas (Manomet) (Table 53.3). CCR was 
much lower than positive agreement for ME-GAP pre- 
dictions for six species (Table 53.3), dropping from 
more than 90 percent positive agreement with BBS to 
less than 35 percent CCR. The same trend was seen 
for seven species when compared with Manomet data. 


All of these species were the more eurytopic species 
modeled. The differences between positive agreement 
and CCR were generally much less for Bayesian pre- 
dictions. 

Agreement between predicted and observed species 
presence within 200 meters of survey locations dif- 
fered from agreement within 50 meters of survey loca- 
tions (Table 53.3). No relationship was observed be- 
tween the measure of significant agreement at 200 
meters and species niche width for either Bayesian or 
ME-GAP predictions, although species with significant 
ME-GAP agreement with BBS data were skewed to- 
ward the more specialist species. 

Significant trends existed between species niche 
width (percentage of vegetation and land-cover types 
used) and measures of positive agreement and CCR 
for BBS data and Manomet data for ME-GAP predic- 
tions (Table 53.4). Positive agreement with field data 
increased with increased number of vegetation types 
[Figure 53.4 panel label: Bayesian prediction, Blackburnian warbler, 40,330 km² of predicted species presence]

Figure 53.4. Predicted occurrences (black) of the Blackburnian warbler (Dendroica fusca) in Maine from the Bayesian method
and the Maine Gap Analysis Project (ME-GAP) method. This species uses 27 percent (twenty-seventh widest niche of the 
twenty-eight species modeled) of the vegetation and land cover types present in the ME-GAP map (Boone and Krohn 1998b). 


used by a species; the opposite trend was observed 
with measures of CCR, indicating the increase in pos- 
itive agreement is likely an artifact of increased area of 
predicted species presence rather than an increase in 
the accuracy of model predictions. No significant 
trends were observed with measures of agreement 
(CCR and positive agreement) between Bayesian pre- 
dictions and field data. 


Discussion 


Habitat association models and statistical models dif- 
fer in their basic premises. Habitat association meth- 
ods, being based on literature review and expert 
knowledge, will more likely model potential rather 
than actual habitat for a species. Statistical models, 
being derived from field data, at best will model the 
current (at the time the field data were gathered) dis- 
tribution of a species. 


Habitat association methods, based on associations
observed over larger temporal and spatial ranges, 
yield binary responses predicting the areas that could 
possibly be suitable for a species based on the presence 
or absence of habitat elements. Rule-based models de- 
rived from simple heuristics, such as species-habitat 
association matrix models, will tend to integrate over 
the variability inherent in statistical models. There- 
fore, rule-based models are expected to be more gen- 
eral in their predictions of species occurrence and will 
be less influenced by yearly variation in habitat use 
patterns. 
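A species-habitat association matrix of this kind reduces to a simple lookup, sketched below. The sketch treats a species' model as nothing more than a set of used vegetation and land-cover types, which is a simplification; the function names and type labels are hypothetical. Niche width is computed as defined for Table 53.1, using the thirty-seven mapped types.

```python
def niche_width(types_used, n_types=37):
    """Percentage of mapped vegetation and land-cover types a species is modeled to use."""
    return 100.0 * len(types_used) / n_types


def rule_based_prediction(cover_map, types_used):
    """Binary presence map: 1 where the mapped type is in the species' used set."""
    return [[1 if cell in types_used else 0 for cell in row] for row in cover_map]
```

A species modeled to use 27 of the 37 types has a niche width of about 73 percent, the largest value listed in Table 53.1, and its predicted presence covers every cell mapped as one of those types.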

The advantages of using species-habitat associa- 
tions to predict species presence are that the predic- 
tions can be made for large spatial extents (but see 
Austin 1999b for an example of statistical models 
for large spatial extents) and for as many species as 
there is adequate knowledge to form general rules of
association. The disadvantages are that (1) species 


Figure 53.5. (A) Correct classification rate (CCR) and (B) off-diagonal ratio values for agreement between Bayesian and Maine Gap Analysis Project (ME-GAP) species predictions calculated for five hundred random points located throughout Maine for three distances from point locations (0-50 meters, 51-200 meters, and 201-400 meters). Species are ordered by decreasing number of vegetation and land-cover types used from left to right. Species AOU codes given in Table 53.1.


TABLE 53.3. 


Percent agreement (positive agreement and correct classification rate [CCR]) and significant agreement of Maine Gap Analysis 
Project (ME-GAP) predictions and Bayesian predictions with verification data (Breeding Bird Survey [BBS]) and validation data 
(Manomet Center for Conservation Sciences field data [Manomet]). 


                          BBS                                              Manomet
                          Positive       CCR^b          Difference:       Positive       CCR^b          Difference:
                          agreement^a                   t-test value^c    agreement^a                   t-test value^c
Species name              ME-GAP Bayes   ME-GAP Bayes   ME-GAP Bayes      ME-GAP Bayes   ME-GAP Bayes   ME-GAP Bayes
Turdus migratorius 0.98 0.93 0.27 O41  -2.6 4.094 1.00 0.43 O87 Oor "Z027 TOMO 
Corvus brachyrhynchos 0.99 0.92 0.17 0.50 -3.5 9.4 1.00 0.83 0.07 0.41 0.46 0.15 
Geothlypis trichas 0.97 0.91 0.237 0.40 0.3 6.6 1.00 068 0:31 0.66 1.90 4.73 
Melospiza melodia 0.97 0.94 0.17 O61 -3.9 166 1.00 On Ano mel 0.69 096 4.46 
Contopus virens 0.78 0.92 0.39 0.55 -0.1 3.8 1.00 0.7/0) 023 0.60 1.12 -0.97 
Zonotrichia albicollis 0.91 0.88 0.44 0.60 8.9 9.3 1.00 0.87 0.44 0.62 0.81 «3.48 
Vermivora ruficapilla 0.95 092 013 (Olea) -1.5 5.4 1.00 0.86 0.34 0.58 0.67 5.37 
Poecile atricapilla 0.80 0.90 0.39 0.51 -15 0.2 0.97 073 TOS 0.51 0.82 -0.60 
Setophaga ruticilla 0.80 0.80 0.41 0.57 115 3.3 0.95 058 TORT 0.48 1.00 -1.7/7 


Dendroica pensylvanica 0.53 0.96 0.70 0.32 0.1 4.6 0.76 0.62 0.61 0.62 4.00 1.57 
Pheucticus ludovicianus 0.77 0.81 0.59 0.59 1-9 2.2 0.67 0.59 0.61 0.39 -0.32 -3.87


Cyanocitta cristata 0.82 0.95 0.39 0.37 -0.8 4.3 0.97 0.69 0.27 0.62 0.45 -0.81
Catharus guttatus 0.94 087 0:258 70'S 0.0 7.6 0.89 0195; 0/51 0.50 -0.60 -1.75 
Troglodytes troglodytes 0.87 0.89 0.52 0.64 9.2 8.2 0.94 0.74 0.65 0.65 2.00 0.79 
Sphyrapicus varius 0.96 0.95 0.38 0.51 6.0 7.9 1.00 0.86 0.27 0.46 0.28 -1.12 
Vireo olivaceus 0.82 0.89 0.33 0.49 5.6 1.8 1.00 Quo p OM 0.73 3.00 3.29 
Carpodacus purpureus 0.75 0.85 0.44 0.63 -0.6 4.2 0.86 0.75 0.28 0.53 -121 009 
Dendroica magnolia 0.89 0.95 0.47 0.62 7.8 11.1 0.89 0.83 0.62 0.67 2.10 2.36 
Empidonax minimus 0.72 0.85 0.48 0.65 1.8 5.3 0.72 0.78 0.62 0.53 2.65 0.89
Catharus fuscescens (0) 74! 0.80 0.54 0.68 0.3 6.4 0.86 0.89 0.52 0.39 1.80 0.35 
Parula americana 0.82 OH Osis i051 5.7 8.4 0.78 0.75 0.44 0.52 0.90 0.87 
Dendroica coronata 0.70 os 0557 06L -14 5.8 0.70 0.93 0.62 0.59 -0.18 -2.24 
Dendroica virens 0.81 0.92 0.52 0.58 5.4 7.6 0.88 0.86 0.63 0.64 4.50 -0.14
Mniotilta varia 0.71 0.91 0.57 0.49 ial 4.1 0.69 0.86 0.54 0.40 -0.08 -0.12 
Seiurus aurocapillus (09705 0.87. 0.55 £0.48 4.5 5.1 0.90 0.82 0.67 0.73 5.90 - 7.06 
Sitta canadensis 0.84 0194 29751531 057 1.8 7.5 0.90 0.94 0.62 0.56 2.00 -2.69 
Dendroica fusca 0.77 0.95 0.59 0.54 4.4 6.2 0.84 OLTA OSS 0.53 0.90 -1.37 
Regulus satrapa 0.49 0.93 0.78 0.60 1.6 5.7 0.47 0.72 0.63 0.64 0.90 2.16 


^a Positive agreement is the joint predicted and observed presence as a percentage of the total observed presence within 50 meters of field survey points (Equation 53.4).

^b Correct classification rate is the percentage of overall agreement between species predictions and field observations (joint predicted presence and joint predicted absence); Equation 53.1.

^c The t-test value for the difference in the proportion of the area with predicted species presence within 200 meters of survey points where the species was observed and where the species was not observed.

^d Boldface type indicates one-tailed (positive) significant (P < 0.05) difference.
TABLE 53.4.


Results for agreement measures between model predictions 
and verification data (Breeding Bird Survey [BBS]) and 
validation data (Manomet Center for Conservation Sciences 
field data) regressed against the percentage of vegetation and 
land-cover types used by each of the twenty-eight species 
modeled. 


            BBS                               Manomet
            Positive agreement    CCR         Positive agreement    CCR
Slope       0.0062                -0.009      0.0068                -0.011
P-value     < 0.0001              < 0.0001    < 0.0001              < 0.0001
r²          0.461                 0.601       0.480                 0.542


habitat relationships are based on descriptive rather 
than statistical relationships, (2) predictions are bi- 
nary (presence or absence) or categorical (e.g., rarely 
used, sometimes used, often used) with no associated 
probability of finding a species, and (3) predictions 
are of coarse spatial resolution (grain), usually greater
than 90 meters and often as large as 1 kilometer.
Rule-based approaches also may be more limited 
than statistical methods in their inclusion of data lay- 
ers other than vegetation and land-cover types be- 
cause of the lack of known relationships between 
species presence and derived environmental data 
layers. 

Limitations of the statistical method described in 
this chapter include (1) incomplete associations be- 
tween explanatory variables and species presence due 
to incomplete species counts (Boone and Krohn 
1999), (2) the ability to model only species with suf- 
ficient field data to build valid statistical relation- 
ships, and (3) yearly variation in species distribution 
affected by the particular spatiotemporal dynamics 
functioning at both local and regional scales (Wiens 
1981b, 1989c; Maurer, Chapter 9). The number of 
species successfully modeled using a statistical ap- 
proach potentially could be increased by pooling 
multiple years of BBS data for species that are less 
common, but issues of pseudo-replication would 
have to be addressed. 

The ability to make precise predictions about what 
is appropriate habitat for a species will be directly re- 
lated to the breadth of habitats used by that species 


(Csuti 1996). It is unlikely that species with narrow 
habitat requirements would be clearly correlated with 
unclassified satellite imagery or vegetation cover types 
derived from satellite imagery. The probability of a 
landscape containing microhabitat resources required 
by a species increases as spatial extent increases (Csuti 
1996). However, if relationships between species pres- 
ence and vegetation types are too general, associations 
will be so broad as to yield predictions of species pres- 
ence everywhere, which may not accurately depict 
even a species' potential distribution (see also Karl et al.,
Chapter 51 and Dettmers et al., Chapter 54). Johnson 
and Krohn (Chapter 13) also found differences in the 
error rates of predictions for three species of seabirds 
and related this variability to differences in species 
niche width. 

Agreement between the predictions for the habitat 
association method and statistical method tested in 
this study was variable, but, as expected, the statistical 
models tended to underpredict species presence rela- 
tive to habitat association model output. Gap predic- 
tions are known to overestimate (commission errors) 
species presence (Scott et al. 1993; Edwards et al. 
1996; Krohn et al. 1998; Garrison and Lupo, Chapter 
30) more often than underestimate presence (omission 
errors), although high commission error can result 
from incomplete field surveys (Boone and Krohn 
1999). As predicted, ME-GAP methods predicted 
more occurrences for the more eurytopic species than 
were predicted by the Bayesian methods. For four 
species (Dendroica pensylvanica, Dendroica coronata, 
Mniotilta varia, and Sitta canadensis), Bayesian mod- 
els tended to overpredict species occurrence with re- 
spect to ME-GAP predictions. These species, with the 
exception of Dendroica pensylvanica, tended to be
the more stenotopic species. 

ME-GAP methods clearly overestimated species oc- 
currence when compared against field data. The posi- 
tive agreement of ME-GAP predictions with BBS and 
Manomet field data generally increased from
habitat specialists to habitat generalists. The trend in 
correct classification rate was exactly the opposite: 
ME-GAP predictions for the generalist species may greatly
overpredict their current distribution. However, the CCR metric
depends on the assumption that field data are complete
(i.e., few to no nondetected birds).

The correct classification rate for ME-GAP mod- 
els between observed and predicted species occur- 
rences was much lower for most species than the 
positive agreement (correctly predicted presence) 
was. However, because field test data were survey 
data, species absence was not explicitly recorded. 
Therefore, comparing the positive agreement of 
model predictions with field data is more appropri- 
ate than testing the correct classification rate, which 
will tend to inflate errors. Also, because the purpose 
of GAP prediction is to predict potential habitat rather
than actual habitat and to aid in conservation
planning, Edwards et al. (1996) argue that errors
of commission are preferable to errors of
omission. However, because species predictions from
GAP models are ultimately pooled to create maps of 
species richness, commission errors may mask true 
areas of species richness if the patterns of commis- 
sion errors exhibit positive spatial autocorrelation. 
Testing the spatial pattern of commission errors 
through the inclusion of neighborhood values (e.g., 
Augustin et al. 1996; Fielding and Bell 1997; Klute et 
al., Chapter 27) potentially could be used to correct 
for spatial patterns of errors. Ultimately, field verifi- 
cation will be needed to test any model prediction, 
whether attempting to protect species richness or a 
single endangered species. 

Because birds more often respond to structural el- 
ements than to the floristic composition of their en- 
vironment (Cody 1985b), the vegetation and land- 
cover map created for ME-GAP concentrated on 
delineating general vegetation types. The general 
vegetation and land-cover types, while sufficient for 
ME-GAP predictions, may have been too general in 
classification, yielding nonsignificant relationships 
in the Bayesian model parameterization. Habitat 
generalists (e.g., Geothlypis trichas), which will 
more likely be overpredicted than underpredicted by 
habitat association methods, would likely not be 
modeled well by Bayesian methods unless the 
species was abundant and clear differences in pres- 
ence/absence existed between vegetation and land- 
cover types. 

ME-GAP predictions for one species (Regulus
satrapa) had poor agreement with BBS and Manomet
data. The ME-GAP model predicted much less area
of habitat for Regulus satrapa than Bayesian models
predicted (28,355 versus 36,098 square kilometers,
respectively). Regulus satrapa used the fewest vegetation
and land-cover classes of the species discussed
in this study. It is likely that the mapped classes did 
not correspond well to the habitat used by Regulus 
satrapa, and the decision to label each vegetation 
and land-cover class as used or unused—an example 
of misclassified cases (see Fielding, Chapter 21)—un- 
derestimated the available habitat for Regulus 
satrapa. 

Changes in vegetation due to conversion or forest 
harvest operations will change available habitat for 
some species. Species increasing in range or decreasing 
in number should have more potential habitat than
occupied habitat. Species with stable populations that
have occupied all potential habitat should have similar 
values for potential and occupied habitat. Krohn 
(1996) postulated that species-habitat relationships 
measured at time t1 and predicted for time t2 will lead
to models with poor fit or erroneous predictions un- 
less the populations under consideration are at or near 
their carrying capacity. Low populations will lead to 
underestimating the full range of habitats the species 
will potentially use, whereas high population levels 
will have species present in sub-optimal habitat 
(Brown 1969a; Fretwell and Lucas 1969; O'Connor 
1986). It is likely that population status influenced 
both the comparisons of model predictions for each 
method and agreement of model predictions with field 
data. 

Problems can arise in interpreting the results of ac- 
curacy assessments for at least five reasons: (1) incom- 
plete or unrepresentative reference data sets (e.g., Ed- 
wards et al. 1996; Boone and Krohn 1999; Fielding, 
Chapter 21); (2) unsaturated populations; (3) site
fidelity; (4) temporal and spatial changes; and (5)
incorrect accuracy measurements (see Fielding, Chapter
21). Incomplete reference data sets will inflate meas- 
ures of commission errors (e.g., Edwards et al. 1996) 
and may deflate measures of overall prediction accu- 
racy. If the local population has not saturated the 
available optimum habitat, individuals may not be ob- 
served in all available areas. In this scenario, models 
may either fail to predict suitable habitat or the
commission error rate may be increased above the 
“true” rate (Fielding and Bell 1997). Species that are 
philopatric will remain in suboptimal habitat even 
when optimal habitat becomes available. If a species’ 
local population is declining from a once-high level 
where many sub-optimal areas were occupied, models 
predicting species occurrence may actually measure 
the “wrong” (i.e., sub-optimal) associations between a 
species and its environment. Temporal and spatial 
changes in species distribution will most likely follow 
changes on the landscape, although studies have ob- 
served lags in population changes following environ- 
mental changes (e.g., Wiens 1989c). Krohn (1996) 
showed that while the number of forest bird and small 
mammal species did not change with time at one site 
in Maine, the cumulative number of species increased 
over the same time period. Because data sets for test- 
ing species model predictions are rarely gathered for 
the same time period on which predictions are based,
temporal changes in species composition will lead to 
calculated errors in model accuracy that may not be 
on-the-ground errors. 

If populations were fluctuating during the time pe- 
riod being modeled, agreement between the two 
modeling approaches may reflect this variation as 
discussed above. If, for example, a species was below 
carrying capacity, then habitat association methods, 
when based on robust literature covering many envi- 
ronmental conditions, would presumably predict a 
wider range of potentially used habitats than the sta- 
tistical models would predict. In this instance, statis- 
tical models would likely predict currently used habi- 
tats and would underestimate potential species 
distributions. When populations are close to K, 
model predictions may be similar because the pat- 
terns of recorded and observed use of vegetation and 
land-cover type would be similar (Krohn 1996). This 
agreement will hold only if populations remain rela- 
tively stable. 


Conclusions and Management 
Implications 


The results of the study presented in this chapter suggest
the importance of comparing the results from different
methods of predicting species occurrences.
Specifically, statistical methods generally will be more 
conservative than habitat association models in the 
area of predicted habitat for a species. Additionally, 
species niche width will affect the agreement between 
predicted species occurrence and measured presence; 
relative to statistical methods, habitat association
methods will generally overpredict generalists more
than specialists. Finally, a single view of
a species’ distribution will always be constrained by 
the data used to build the model and the assumptions 
inherent in each attempted modeling approach. Com- 
parison of the predictions from highly different 
methodologies can be used to better understand natu- 
ral processes and to aid managers in decision making. 
Ultimately, current field data will be required to accurately
gauge predictions from any type of model.


Bird Occurrence at a Regional Scale


Empirically based wildlife-habitat models that can
predict species occurrence over large spatial ex- 
tents (e.g., regional areas of millions of hectares) can 
be very useful in developing regionally based or 
ecosystem-level management plans for wildlife re- 
sources. Numerous statistical modeling techniques 
exist for developing habitat-based models for predict- 
ing wildlife species occurrence. These methods include 
logistic regression (Straw et al. 1986; Nadeau et al. 
1995), discriminant function analysis (Livingston et 
al. 1990; Fielding and Haworth 1995), Mahalanobis 
distance statistic (Clark et al. 1993b; Knick and Dyer 
1997), classification and regression tree analysis 
(O’Connor et al. 1996), cumulative distribution analy- 
sis (Dettmers and Bart 1999), and principal compo- 
nent analysis (Debinski and Brussard 1994). However, 
models produced with these techniques often are not 
validated over large extents or they perform poorly 
when tested (Dedon et al. 1986; Raphael and Marcot 
1986; Johnson et al. 1989). Furthermore, these tech- 
niques are rarely compared with one another to pro- 
vide information on their relative ability to produce 
accurate predictive models. Little information exists 
regarding whether some of these techniques are more 
appropriate for certain types of modeling applications 
beyond what the statistical constraints of the data 
might be. 

In this study, we developed habitat models for predicting
the occurrence of breeding birds in the southern
Appalachian Mountains region using several different
modeling techniques. We present validation results
on the ability of the different models to
accurately predict bird occurrence at two different lo- 
cations within the region. We used logistic regression, 
Mahalanobis distance method, classification and re- 
gression tree analysis (Breiman et al. 1984; Clark and 
Pregibon 1992), and discriminant function analysis to 
develop predictive models based on bird survey data 
in Tennessee, and we tested the models on similar sur- 
vey data from Georgia and Virginia. 


Study Area 


Our study area was the same as the area chosen for 
the Southern Appalachian Assessment (SAMAB 
1996), which extends from western Virginia south to 
northern Georgia (Fig. 54.1) and encompasses over 22 
million hectares. An extensive spatial database, in- 
cluding land-cover types and landscape metrics, was 
assembled for the Southern Appalachian Assessment 
(SAA). We used this collection of spatial data as the 
source of predictor variables for our habitat models. 


Bird Data 


Figure 54.1. Boundaries for the Southern Appalachian Assessment
and the three national forests from which bird survey data
were collected.

We used data collected from three point-count survey
projects on three national forests within our study
area: the George Washington and Jefferson National
Forests in Virginia, Cherokee National Forest in Tennessee,
and Chattahoochee National Forest in Georgia.
All point counts were conducted following the
same protocol, which conforms to the point-count
standards recommended by Ralph et al. (1995a). The
surveys were ten-minute, unlimited-radius point 
counts, with detections noted as either 50 meters or 
less, or more than 50 meters from the point. We used 
only the 50-meters-or-less data for the development 
and testing of models reported in this chapter. All 
counts were conducted between the hours of 0600 and 
1000. Within each national forest, the point count lo- 
cations were selected in a stratified random manner 
such that approximately equal numbers of points were 
placed in stands representing combinations of six 
major forest types (yellow pine, mixed hardwood- 
yellow pine, oak-hickory, eastern hemlock-white 
pine, cove hardwood, and northern hardwood) and 
three size-class categories (seedling-sapling, pole tim- 
ber, saw timber). Points were surveyed once annually 
between mid-May and mid-June. We used data col- 
lected from 1992 to 1996 in Georgia, where a total of 
650 points were surveyed, from 1992 to 1996 in Ten- 
nessee, where a total of 215 points were surveyed, and 
from 1993 to 1997 in Virginia, where a total of 764 
points were surveyed. 


Model Development and Testing 


We developed models for twenty-five common bird 
species, but we will present results from six representa- 
tive species (black-throated blue warbler [Dendroica 
caerulescens], black-throated green warbler [Dendroica 
virens], Carolina chickadee [Poecile carolinensis], ovenbird
[Seiurus aurocapillus], veery [Catharus fuscescens],
yellow-billed cuckoo [Coccyzus americanus]), which we 
will use to generalize the results in a more manageable 
form. These six species were chosen as a representative 
sample of the entire twenty-five species set. They in- 
cluded species that nest on the ground (ovenbird), at the 
shrub and mid-story levels (black-throated blue war- 
bler, veery, yellow-billed cuckoo), in the canopy (black- 
throated green warbler), and in cavities (Carolina chick- 
adee). This group of species also had several habitat 
“generalists” (i.e., used numerous forest types: Carolina 
chickadee, ovenbird, yellow-billed cuckoo) and several 
habitat “specialists” (i.e., restricted to a small number 
of forest types: black-throated blue warbler, black- 
throated green warbler, veery). 

For each species, we used the Tennessee data to de- 
velop models for predicting species occurrence based 
on a suite of twenty-six habitat-related variables 
(Table 54.1), all of which were derived from remotely 
sensed data and taken from the spatial databases of 
the SAA (SAMAB 1996). We selected these twenty-six 
variables based partly on availability in the SAA database
and partly on past modeling experience, which sug- 
gested these variables can be useful in predicting bird 
occurrence. Using this suite of explanatory variables, 
we developed four models for each species—one from 
each of the modeling techniques we were interested in 
comparing: logistic regression, Mahalanobis distance 
method, classification and regression tree (CART) 
analysis, and discriminant function. We then used the 
Georgia and Virginia data as separate, independent 
tests of the models' ability to accurately predict the oc- 
currence of the six bird species across the Southern 
Blue Ridge physiographic province. 

TABLE 54.1.

Predictor variables used to develop the bird-habitat models. All variables were derived from
remotely sensed data collected for the Southern Appalachian Assessment.

Elevation                              % Forest cover, 1 km
Slope                                  % Forest cover, 5 km
Aspect                                 % Forest cover, 10 km
Water flow accumulation downhill       Forest type
Distance from nearest stream           Dominant forest type, 1 km
Relative slope position                Dominant forest type, 5 km
Planiform curvature                    Dominant forest type, 10 km
Profile curvature                      Land-cover diversity, 1 km
Solar exposure                         Land-cover diversity, 5 km
Shannon-Weaver topographic index       Land-cover diversity, 10 km
Topographic similarity index           Forest diversity, 1 km
Topographic relative moisture index    Forest diversity, 5 km
Topographic convergence index          Forest diversity, 10 km

For logistic regression, we used Proc Logistic (SAS
Institute 1990a) to complete a step-wise variable selection
followed by a best-subsets procedure. This procedure
found the model that resulted in the lowest
Akaike Information Criterion (AIC) value, had a
significant fit to the data according to the Hosmer and
Lemeshow (1989) goodness-of-fit test, and provided 
the highest overall correct classification on jackknife
tests when sensitivity (percentage of occurrences cor- 
rectly classified) and specificity (percentage of ab- 
sences correctly classified) were equally balanced. For 
the Mahalanobis distance method, we used the same 
procedure described in Clark et al. (1993b) and then 
performed a best subsets procedure to select the model 
that resulted in the largest correct classification rate 
from jackknife tests when sensitivity and specificity 
were balanced equally. For the CART models, we fol- 
lowed the same procedure described in O'Connor and 
Jones (1996), which was to overfit an initial tree to the 
data and then prune the tree to an optimum fit based 
on cross-validation analysis, which is a form of jackknife
test. We used Proc Discrim (SAS Institute 1990a)
to develop discriminant functions for the presence/ab- 
sence of each species, and we used the cross-validation 
option to perform a jackknife test. We used a best sub- 
sets procedure, and the final discriminant function 
model was chosen as the one for which all variables 
remaining in the model were significant and that re- 
sulted in the greatest correct classification rate from 
the jackknife test. 
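
To illustrate the quantity underlying the Mahalanobis distance method, the sketch below computes a squared Mahalanobis distance for two habitat variables. It is an illustration under stated assumptions, not the procedure of Clark et al. (1993b): the function name `mahalanobis_sq`, the two variables (elevation and slope), and the example values are hypothetical, and the mean vector and covariance matrix are assumed to be estimated from the sites where the species was detected.

```python
def mahalanobis_sq(point, occupied):
    """Squared Mahalanobis distance from `point` to the mean of `occupied`.

    `occupied` is a list of (variable_1, variable_2) tuples measured at
    sites where the species was detected. Two variables only, so the
    2 x 2 covariance matrix can be inverted in closed form.
    """
    n = len(occupied)
    m1 = sum(a for a, _ in occupied) / n
    m2 = sum(b for _, b in occupied) / n
    # Sample variances and covariance of the occupied sites
    s11 = sum((a - m1) ** 2 for a, _ in occupied) / (n - 1)
    s22 = sum((b - m2) ** 2 for _, b in occupied) / (n - 1)
    s12 = sum((a - m1) * (b - m2) for a, b in occupied) / (n - 1)
    det = s11 * s22 - s12 ** 2
    d1, d2 = point[0] - m1, point[1] - m2
    # d' S^-1 d written out for the 2 x 2 case
    return (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det


# Hypothetical occupied sites: (elevation in meters, slope in degrees)
occupied = [(900, 10), (1000, 20), (1100, 15), (1000, 15)]
at_mean = mahalanobis_sq((1000, 15), occupied)
near = mahalanobis_sq((1050, 16), occupied)
far = mahalanobis_sq((400, 40), occupied)
print(at_mean, near, far)
```

Smaller distances indicate conditions closer to the average occupied site, which is why a smaller value is read as a higher likelihood of occurrence.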

All of the forms of jackknife tests used in these dif- 
ferent methods are calculated in similar fashion by re- 
cursively removing one observation (i.e., results from 
one point count station) at a time from the total data 
set, recalculating the model parameters, testing 
whether the model correctly predicts the removed observation,
replacing that observation, removing the
next, and so on through the entire data set. The cumulative
percent correct classification over the entire data
set provides an indication of a model's predictive abil- 
ity for a new data set. 
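
The leave-one-out procedure described above can be sketched in a few lines. This is a minimal illustration only: the nearest-centroid classifier stands in for any of the four modeling methods and is not one of them, and the function names and elevation values are hypothetical.

```python
def fit_centroids(xs, ys):
    """Mean predictor value for absences (0) and presences (1)."""
    means = {}
    for label in (0, 1):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means


def predict(means, x):
    """Assign the class whose mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))


def jackknife_percent_correct(xs, ys):
    """Remove one observation, refit, test on it, replace it, repeat."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        means = fit_centroids(train_x, train_y)
        if predict(means, xs[i]) == ys[i]:
            correct += 1
    return 100.0 * correct / len(xs)


# Hypothetical point-count stations: elevation (m) and presence/absence
elevation = [400, 500, 600, 1100, 900, 1000, 1200, 650]
present = [0, 0, 0, 0, 1, 1, 1, 1]
percent_correct = jackknife_percent_correct(elevation, present)
print(percent_correct)  # 75.0
```

Each held-out station is classified by a model that never saw it, so the cumulative rate is less optimistic than a rate computed on the same data used for fitting, although every station still comes from the same region as the training data.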

We tested the best models from the procedures just 
described by using the Georgia and Virginia data sets 
as independent tests. For each of these data sets, we 
had global positioning system (GPS) locations for the 
point count stations. We used these locations to sam- 
ple the SAA spatial data sets containing the explana- 
tory habitat variables and to obtain the habitat values 
associated with each point count location. We then 
ran the various models on these habitat data sets to 
get predictions of presence/absence (CART and dis- 
criminant function) or probability of occurrence (lo- 
gistic regression and Mahalanobis distance). For all 
four of the modeling methods, we calculated the per- 
cent correct classification of the observations in each 
of the new data sets (Georgia and Virginia). We 
termed this procedure the classification test. For logis- 
tic regression and the Mahalanobis distance method, 
we had to select a cut-off probability value in order to 
convert the probabilities of occurrence predicted from 
these methods into predictions of presence/absence. 
We used the same cut-off values that resulted in 
the best models from the model development phase 
using the Tennessee data set. These cut-off values were 
selected to achieve the best overall correct classifica- 
tion rate for the jackknife tests on the Tennessee data 
while maintaining a balance between sensitivity and 
specificity. 
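
One way to operationalize a cut-off that balances sensitivity and specificity is sketched below. The selection rule shown (smallest gap between sensitivity and specificity, with ties broken by the higher overall correct classification rate) is an assumed reading of the criterion described above, and the function names, candidate cut-offs, and probabilities are hypothetical.

```python
def sensitivity_specificity(probs, observed, cutoff):
    """Share of presences and of absences correctly classified at `cutoff`."""
    n_present = sum(observed)
    n_absent = len(observed) - n_present
    true_pos = sum(1 for p, o in zip(probs, observed) if o == 1 and p >= cutoff)
    true_neg = sum(1 for p, o in zip(probs, observed) if o == 0 and p < cutoff)
    return true_pos / n_present, true_neg / n_absent


def balanced_cutoff(probs, observed, candidates):
    """Pick the candidate with the smallest |sensitivity - specificity|.

    Ties are broken in favor of the higher overall correct rate.
    """
    n_present = sum(observed)
    n_absent = len(observed) - n_present

    def score(cutoff):
        sens, spec = sensitivity_specificity(probs, observed, cutoff)
        overall = (sens * n_present + spec * n_absent) / len(observed)
        return (abs(sens - spec), -overall)

    return min(candidates, key=score)


# Hypothetical predicted probabilities and observed presence/absence
probs = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
observed = [1, 1, 1, 1, 0, 0, 0, 0]
candidates = [0.2, 0.35, 0.5, 0.65, 0.8]
cutoff = balanced_cutoff(probs, observed, candidates)
print(cutoff, sensitivity_specificity(probs, observed, cutoff))
```

In this example a cut-off of 0.35 gives a higher overall correct rate (7 of 8) than the balanced choice of 0.5 (6 of 8), which shows how the choice of criterion alone changes the reported classification rate.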

TABLE 54.2.

Percentage of correct classifications for jackknife tests on the Tennessee avian point count data for
species-specific models developed using four different modeling methods.

                                 % points     Logistic     Mahalanobis               Discriminant
Species                          occupied^a   regression   distance      CART^b      function
Black-throated blue warbler
  Dendroica caerulescens         26           85.7         84.3          90.7        94.4
Black-throated green warbler
  Dendroica virens               48           70.7         97            80.9        84.2
Carolina chickadee
  Poecile carolinensis           44           58.6         56.3          80.5        94.9
Ovenbird
  Seiurus aurocapillus           53           68.4         61.4          noi         91.2
Veery
  Catharus fuscescens            dL           94.4         94.4          95.8        96.7
Yellow-billed cuckoo
  Coccyzus americanus            19           71.6         66.5          87.9        90.2

^a A total of 215 points were surveyed. This column represents the percentage of the total sample of
points in Tennessee on which a species was detected.
^b Classification and regression tree analysis.

In addition to calculating the percent correct
classification using these methods, we also used the actual
probabilities of occurrence predicted from logistic
regression and Mahalanobis distance to test the useful- 
ness of these models in another manner. For this sec- 
ond test, we used logistic regression analysis to test for 
a significant relationship across an entire data set be- 
tween the model-predicted probabilities of occurrence 
for each location and the observed presence/absence at 
that location. We termed this procedure the association 
test. A significant relationship (P < 0.05) in the correct 
direction indicated that the models were performing 
well because higher predicted probabilities of occur- 
rence were associated with locations where the given 
species was actually observed to occur. For logistic re- 
gression, the correct direction of this relationship was 
positive (higher predicted probabilities of occurrence 
associated with observed occurrence). The correct di- 
rection was negative for the Mahalanobis distance 
models because we used the actual Mahalanobis dis- 
tance values, for which smaller values indicated a 
higher probability of occurrence (Clark et al. 1993b). 
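
The association test can be sketched as a single-predictor logistic regression of observed presence/absence on the model output, with a Wald chi-square for the slope. This is an illustrative implementation only: the original analysis used SAS, whereas the Newton-Raphson fit, the function names, and the data below are assumptions made for the example.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit logit(P(y=1)) = b0 + b1*x by Newton-Raphson.

    Returns (b0, b1, se_b1). Assumes the data are not perfectly
    separable, otherwise the estimates do not converge.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p          # score for the intercept
            g1 += (yi - p) * xi   # score for the slope
            h00 += w              # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Variance of b1 is the (2,2) element of the inverse information
    # matrix; at convergence the last update is negligible.
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1


def association_test(model_output, observed):
    """Return (slope, Wald chi-square, P value) for the slope, 1 df."""
    _, b1, se = fit_logistic(model_output, observed)
    wald = (b1 / se) ** 2
    p_value = math.erfc(math.sqrt(wald / 2.0))  # upper tail, chi-square 1 df
    return b1, wald, p_value


# Hypothetical predicted probabilities and observed presence/absence
predicted = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
observed = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]
slope, wald, p_value = association_test(predicted, observed)

# A distance-like score (smaller = more suitable) reverses the sign
distance_like = [1.0 - p for p in predicted]
slope_d, wald_d, p_value_d = association_test(distance_like, observed)
print(slope, wald, p_value)
print(slope_d, wald_d, p_value_d)
```

The second call mirrors the Mahalanobis case: when the predictor is a distance rather than a probability, the expected slope is negative while the Wald statistic is unchanged.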


Model Development Results 


Results from the jackknife tests on the data set used
for developing the models showed variability, both
among species and among modeling techniques (Table
54.2). These results provide an indication of how well 
the models should perform when tested on new data. 
Discriminant function models performed the best on 
the jackknife tests for all species, correctly classifying 
over 80 percent of the observations in the Tennessee 
data set for all species and over 90 percent for many 
of the species. However, this high rate of correct clas- 
sification is somewhat misleading, because the jack- 
knife test of the discriminant function procedure does 
not balance sensitivity and specificity but rather seeks 
the best total correct classification. CART models also 
performed well on the jackknife tests, with results that 
were better than logistic regression and Mahalanobis 
distance but slightly worse than discriminant function 
models for all species. CART models typically cor- 
rectly classified 80 percent or more of the observa- 
tions, with some of these models correctly classifying 
more than 90 percent of the observations. The logistic 
regression and Mahalanobis distance models did not 
perform as well on the jackknife test for all species, al- 
though they both did well (over 80 percent correct) on 
black-throated blue warbler and veery (species that 
are restricted to higher elevations in our study area). 
The logistic regression models also correctly classified 
over 70 percent of the observations for black-throated 
green warbler and yellow-billed cuckoo. Both modeling
techniques produced models that correctly classified
less than 70 percent of the Tennessee observations
for ovenbird and less than 60 percent for Carolina 
chickadee, suggesting that the models for these two 
species, which tend to be widely distributed across for- 
est types and stand conditions, were not likely to be as 
successful at accurately predicting presence/absence in 
new locations. 


Model Testing Results 


As an example, we provide a graphical representation 
of output from the four models in comparison to observed
field data for yellow-billed cuckoo in the Deerfield
Ranger District of the George Washington and
Jefferson National Forests in Virginia (see Fig. 54.2 in
color section). Although only one species and a small
portion of the Virginia test data used are shown, this
figure provides a visual reference for how the test-data
points were spatially configured and what typical
model output was like for each of the modeling
techniques.

TABLE 54.3.

Results from testing bird-habitat models for six species on two independent data sets. The models were developed using four
different modeling methods and covered the southern Appalachian Mountains.

            Logistic regression                       Mahalanobis distance                      CART^a      DF^b
Species^c   % correct  Sign  Wald χ²  P value         % correct  Sign  Wald χ²  P value         % correct   % correct

Georgia
BTBW        65.5       +     36.49    0.001           65.2       -     17.64    0.001           92.0        65.6
BTNW        54.8       +     11.48    0.001           GTA        -     16.38    0.001           55.4        PX AL
CACH        44.8       -     4.85     0.028           45.7       +     5.69     0.015           49.2        26.5
OVEN        52.9       3     0.86     0.354           54.9       -     3.57     0.059           Soe         40.9
VEER^d
YBCU        61.8       +     15.58    0.001           34.0       +     1.09     0.298           54.6        Se

Virginia
BTBW        65.4       +     25.18    0.001           58.6       -     ewi      0.009           94.9        56.7
BTNW        54.6       +     5.96     0.015           61.4       -     7.99     0.005           67.1        36515
CACH        44.5       -     0.01     0.907           15222      -     0.06     0.802           5077        23.6
OVEN        bs         +     1.05     0.305           56.2       -     4.18     0.041           56.0        34.6
VEER        62.9       +     9.19     0.002           62.4       -     15.84    0.001           85.6        62.4
YBCU        61.8       +     24.25    0.001           55.2       -     1.52     0.217           67.3        4T.1

Note: The final models developed from each of the modeling methods were tested on similar point count data sets from Georgia and Virginia.
"% correct" indicates the percentage of observations in the test data sets that were correctly classified by the model for a species. The rest of the
statistics indicate the strength of the relationship between the probability of occurrence predicted by a model and the observed occurrence at each
point count location. Logistic regression was used to test for the significance of these relationships. We considered a significant positive
relationship to be an indication that logistic regression models performed acceptably well and a significant negative relationship to be an indication
of acceptable performance for the Mahalanobis distance models.
^a Classification and regression tree analysis.
^b Discriminant function analysis.
^c Species codes (see Table 54.2 for scientific names): BTBW = black-throated blue warbler, BTNW = black-throated green warbler, CACH = Carolina
chickadee, OVEN = ovenbird, VEER = veery, YBCU = yellow-billed cuckoo.
^d An insufficient number of veeries were detected on the Georgia sites (N = 4) for testing the models of this species.

A noticeable decrease occurred in the percentage of
observations in the Georgia and Virginia test data sets
that were correctly classified by the discriminant function
and CART models compared to their level of performance
on the jackknife tests with the Tennessee
data. The discriminant function models, in particular,
performed much worse on the test data sets and did
not achieve a correct classification rate over 70 percent
for any of the species (Table 54.3). The CART
models performed well (over 85 percent correct) on
the classification tests for black-throated blue warbler 
and veery, which were the species restricted to high 
elevations and thus should be the ones most easily 
predicted. 

The logistic regression and Mahalanobis distance 
models generally performed rather poorly (less than 
70 percent correct) on the classification tests for all 
species but performed quite well on the association 
tests for a majority of the species (Table 54.3). Black- 
throated blue warbler, veery, and yellow-billed cuckoo 
were the only species for which the logistic regression 
models correctly classified 60 percent or more of the 
observations in the Georgia and Virginia data sets. 
Black-throated blue warbler, black-throated green 
warbler, and veery were the only species for which the 
Mahalanobis distance models achieved more than 60 
percent correct classification on the test data sets. 
However, since both of these modeling techniques are 
more appropriate for predicting probabilities of occur- 
rence than presence/absence, we expected these two 
methods to perform better on the association tests. 
The results from the association tests indicated that 
the logistic regression models for black-throated blue 
warbler, black-throated green warbler, veery, and 
yellow-billed cuckoo performed well (Table 54.3). The 
Mahalanobis distance models for black-throated blue 
warbler, black-throated green warbler, ovenbird (mar- 
ginally on the Georgia data set), and veery performed 
well on the association tests. Neither the logistic re- 
gression nor the Mahalanobis distance models for 
Carolina chickadees performed well on the association 
tests or the classification tests. 


Comparison of Model Performance 


Discriminant function models for all the species ap- 
peared to perform very well when assessed by jack- 
knife tests, but they performed very poorly for all 
species when their predictive ability was tested on in- 
dependent data. All of the discriminant function mod- 
els correctly classified over 90 percent of the observa- 
tions in the development data set using the jackknife 
test, with the exception of the black-throated green 
warbler model, which still performed well (over 80 
percent correct). However, when these models were
tested on data from Georgia and Virginia, their performance
dropped to less than 70 percent correct for
all of the species tested, with most species less than 50 
percent. We believe this poor performance on the clas- 
sification test resulted from discriminant function 
models being chosen on the basis of the best overall 
classification rate, rather than on the basis of balanc- 
ing sensitivity and specificity, resulting in models bi- 
ased toward either presences or absences and thus un- 
able to accurately predict new observations. 

Results of the classification tests were mixed for the 
three other modeling techniques. CART models consistently
performed similarly to or better than the other
models on the classification tests for both the Georgia 
and Virginia data sets across all the species. Perfor- 
mance of CART models for black-throated blue war- 
bler (over 90 percent correct classification for both 
test data sets) and veery (85.6 percent correct classifi- 
cation on the Virginia data) was exceptionally good 
and considerably better than logistic regression and 
Mahalanobis distance models (65 percent or less cor- 
rect classification). Since the occurrence of these two 
species was restricted to higher elevations and could 
be predicted fairly well by elevation alone, our results 
suggest CART might be the most suitable method for 
modeling species whose occurrences are distributed in 
an approximately binomial fashion with regard to a 
key environmental factor (e.g., restricted to occurring 
above a given elevation). 
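
The threshold-like response described above is the situation a single tree split captures directly. The sketch below searches for one elevation threshold and is a simplification: it is a one-split "stump" scored by the number of correct classifications, whereas CART grows and prunes a full tree using an impurity measure. The function name and the station data are hypothetical.

```python
def best_elevation_split(elevations, present):
    """Find the threshold t for the rule 'present if elevation >= t'.

    Candidate thresholds are midpoints between consecutive distinct
    elevations; the first threshold with the highest share of correct
    classifications is returned as (threshold, accuracy).
    """
    values = sorted(set(elevations))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in candidates:
        correct = sum(
            1 for e, p in zip(elevations, present) if (e >= t) == bool(p)
        )
        accuracy = correct / len(elevations)
        if best is None or accuracy > best[1]:
            best = (t, accuracy)
    return best


# Hypothetical stations: elevation (m) and presence/absence of a
# high-elevation species
elevations = [600, 750, 800, 950, 1000, 1100, 1250, 1400]
present = [0, 0, 0, 1, 0, 1, 1, 1]
threshold, accuracy = best_elevation_split(elevations, present)
print(threshold, accuracy)  # 875.0 0.875
```

A single cut on one variable classifies seven of the eight hypothetical stations correctly, which is the kind of pattern a tree can represent with one node but a smooth probability curve can only approximate.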

Performance on the classification tests was consid- 
erably lower and very similar among these three meth- 
ods (logistic regression, Mahalanobis distance, and 
CART) for the rest of the species tested in this chapter, 
with none of the models for the remaining species cor- 
rectly classifying over 70 percent of either test data set 
(and most less than 60 percent). Our results suggested 
that for these three modeling methods, their ability to 
correctly predict presence/absence might be fairly sim- 
ilar for many species. However, the CART method 
was the only one that proved capable of highly accu- 
rate models (over 80 percent correct) for at least some 
species. 

Correct classification of 75-80 percent has been
suggested as a level of accuracy that both researchers
and managers consider acceptable for model
performance (Chalk 1986; Hurley 1986). Our models
for most of the species fell somewhat below this level
of performance for the classification tests. However, 
Morrison et al. (1992:258-262) cautioned that even 
good habitat models typically account for less than 50 
percent of the variation in species occurrence. Most of 
our models performed at least this well. 

One of the factors likely contributing to the low 
levels of model performance was our testing of Ten- 
nessee-based models on data from Georgia and Vir- 
ginia. Although these locations all fell within the 
southern Appalachian region, some of the natural 
variation among these locations undoubtedly was not 
incorporated in a model based solely on the data from 
Tennessee. Thus, the somewhat low levels of perform- 
ance for many of the models were not entirely unex- 
pected, as species responses to habitat conditions cer- 
tainly could vary across this large geographic region, 
especially along the north-south gradient. As sug- 
gested by Heglund (Chapter 1), where steep environ- 
mental gradients exist, stratified surveys along the gra- 
dients are useful for capturing the variability in species 
responses and reducing bias to responses at more local 
scales. The data from Georgia, Tennessee, and Vir- 
ginia represent a stratified sample of the north-south 
gradient, so our models for the southern Appalachians 
might be improved by developing the models from 
data across the entire region. Combining all the data 
from the three locations and then dividing the entire 
set into training and testing groups may have im- 
proved the predictive ability of the models across the 
region. 

Due to the subjectivity involved in selecting cut-off 
values to predict presence/absence (see further discus- 
sion to follow) from the probabilities of occurrence 
generated by logistic regression and Mahalanobis dis- 
tance models, we considered the association tests to be 
a more appropriate method of assessing the perform- 
ance of models developed from these two methods. 
For the association tests, a significant association (P <
0.05) in the correct direction between the predicted 
probabilities of occurrence (or predicted distance from 
the ideal value, in the case of Mahalanobis distance 
models) and the observed presence or absence across 
all samples in the test data sets suggested the models 
performed well at what they were designed to do:
predict which locations were more likely to have the
species of interest present. 

Based on the association tests, both the logistic re- 
gression and Mahalanobis distance models performed 
well (P < 0.05) for most of the species tested. Oven- 
bird and Carolina chickadee were the only species for 
which the logistic regression method did not perform 
well, while yellow-billed cuckoo and Carolina chick- 
adee were the only species for which the Mahalanobis 
distance method did not perform well. As suggested 
by the large Wald χ² values, both methods produced
models that performed very well for black-throated 
blue warbler and veery, which are species restricted to 
higher elevations in the Southern Appalachians, as 
well as black-throated green warbler. The logistic re- 
gression model for yellow-billed cuckoo also per- 
formed well, while the Mahalanobis distance model 
for ovenbird was acceptable, but neither method pro- 
duced good models for both of these species. Develop- 
ing models that perform well for these two species 
could be difficult because they tend to be fairly general 
in their habitat associations and thus occur in a wide 
variety of forest types and condition classes. We have 
no clear explanation why the different methods would 
produce a model that worked well for one of these 
species but not the other. Carolina chickadee was an- 
other species that was widespread and occurred in a 
variety of forest types and condition classes, but nei- 
ther of the methods produced a model that performed 
well for that species. 


Analytical Method 


The classification tests were perhaps a conservative 
means of testing the predictive abilities of the logistic 
regression and Mahalanobis distance models. We con- 
sider the need to choose a cut-off probability level, 
above which the model output is considered to be 
equivalent to a prediction of presence, to be a limiting 
factor in the correct prediction of presence/absence, 
particularly for models that are designed to predict the 
probability of occurrence rather than absolute pres- 
ence or absence. No standardized rules exist for 
choosing a cut-off value, and the cut-off value ulti- 
mately influences the total correct classification rate, 




thus adding an element of subjectivity to the results of 
classification tests. 

A common approach to choosing a cut-off value is 
to select the probability value that results in a balance 
between the sensitivity of the model (proportion of the 
“presence” observations correctly predicted) and the
specificity of the model (proportion of the “absence”
observations correctly predicted). Depending on the 
balance between the number of occurrences and ab- 
sences in the data, this approach to selecting the cut- 
off value may not result in the highest-possible total 
correct classification rate. For instance, if the data set 
includes few occurrences and many absences, then se- 
lecting a cut-off value that maximizes specificity 
would result in a much higher total correct classifica- 
tion than selecting the cut-off value that balances sen- 
sitivity and specificity. Thus, our method of always se- 
lecting the cut-off value that balances sensitivity and 
specificity often resulted in a total correct classifica- 
tion that was below what the model could achieve. 
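
The trade-off described above can be illustrated with a minimal Python sketch. The predicted probabilities and observations are invented for illustration and are not data from this study; the function names are likewise hypothetical. The sketch scans candidate cut-off values and reports the one that best balances sensitivity and specificity alongside the one that maximizes total correct classification.

```python
def rates(probs, observed, cutoff):
    """Return (sensitivity, specificity, total correct) at a given cut-off."""
    tp = sum(1 for p, o in zip(probs, observed) if o == 1 and p >= cutoff)
    tn = sum(1 for p, o in zip(probs, observed) if o == 0 and p < cutoff)
    n_presence = sum(observed)
    n_absence = len(observed) - n_presence
    return tp / n_presence, tn / n_absence, (tp + tn) / len(observed)


def balanced_cutoff(probs, observed, candidates):
    """Cut-off with the smallest gap between sensitivity and specificity."""
    def gap(cutoff):
        sensitivity, specificity, _ = rates(probs, observed, cutoff)
        return abs(sensitivity - specificity)
    return min(candidates, key=gap)


def max_accuracy_cutoff(probs, observed, candidates):
    """Cut-off with the highest total correct classification rate."""
    return max(candidates, key=lambda c: rates(probs, observed, c)[2])


# Hypothetical data: 4 presences and 16 absences (a skewed distribution).
probs = [0.8, 0.6, 0.4, 0.2] + [0.9, 0.5, 0.5] + [0.3] * 4 + [0.1] * 9
observed = [1] * 4 + [0] * 16
candidates = [i / 10 for i in range(1, 10)]

for label, chooser in (("balanced", balanced_cutoff),
                       ("max accuracy", max_accuracy_cutoff)):
    cutoff = chooser(probs, observed, candidates)
    print(label, cutoff, rates(probs, observed, cutoff))
```

With these invented values, the balanced cut-off yields a lower total correct classification rate than the accuracy-maximizing cut-off, but the latter achieves its gain by correctly predicting only half of the presences.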

We believe that balancing sensitivity and specificity 
is the most appropriate method because we are inter- 
ested in maintaining reasonable levels (i.e., over 50 
percent) of sensitivity and specificity for all models, re- 
gardless of the distribution of occurrences and ab- 
sences in the data sets for a given species. If the distri- 
bution of occurrence is skewed and the cut-off value is 
chosen to maximize either sensitivity or specificity, 
then the other measure will usually be extremely low. 
Furthermore, maximizing either sensitivity or speci- 
ficity to achieve a high correct classification of the de- 
velopmental data is likely to bias the models and limit 
their ability to make accurate predictions for new lo- 
cations, as suggested by the results from testing the 
discriminant function models. 


Management Implications and 
Conclusions 


Our development and comparison of modeling meth- 
ods was part of a large effort to provide a series of re- 
gional wildlife-habitat models for the southern Ap- 
palachians. The U.S. Forest Service plans to use these 
models in an effort to coordinate national forest plan- 
ning across this region. Having bird models developed 
from the same set of habitat data across the entire
region will allow the Forest Service to consistently assess
how alternative management options for the different 
national forests will affect the availability of bird habi- 
tat across the region. The results from this study indi- 
cate that reasonable models can be developed for 
some species over large spatial extents. Thus, in areas 
where resource management agencies have an interest 
in coordinating planning efforts over large areas, use- 
ful predictive models can be developed to assist in 
evaluating management alternatives. 

Our results suggest several generalizations regarding 
the ability of the four modeling methods compared in 
this study to produce good predictive wildlife-habitat 
models for large spatial extents when using data simi- 
lar to those used in this study. First, while discriminant 
function models are likely to distinguish between pres- 
ence and absence locations within the original data set, 
such models are unlikely to perform well when tested 
on new data sets. Second, both logistic regression and 
Mahalanobis distance models appear to be good gen- 
eral methods for predicting probabilities of occurrence 
for new locations, although they may not perform as 
well predicting absolute presence/absence. These two 
methods performed well on our association tests for 
most of the species that we evaluated. CART analysis 
may be the best method when the correct prediction of 
species presence and absence at given locations is the 
desired product from the models. 

The difficulties in developing successful predictive 
models for some species were also clear from our re- 
sults. Generalist species that occur over a wide range 
of habitat types seem to be particularly problematic 
(Hepinstall et al., Chapter 53). The models for some 
species in our study might have been improved by de- 
veloping the models with data from across the entire 
region rather than just from Tennessee, but some gen- 
eralist species may simply be difficult to model. 
Species that are more limited in their habitat utiliza- 
tion (e.g., elevation, forest type, condition class) are 
more likely to be modeled well. This trend was espe- 
cially true for the CART models, which performed ex- 
tremely well on the classification tests for the species 
that were elevation limited (e.g., black-throated blue 
warbler, veery) but did not perform very well on the 
species that were more general in their habitat use. 
Both logistic regression and Mahalanobis distance
models performed well on our association tests for
many of the species and thus both should be consid- 
ered good options for general wildlife-habitat 
modeling. Logistic regression is a more completely 
developed technique that has been extensively de- 
scribed by statisticians (e.g., Hosmer and Lemeshow 
1989), including appropriate methods for model selec- 
tion and determining the goodness-of-fit of models. 
The Mahalanobis distance method is not yet a com- 
monly used method and lacks well-defined procedures 
for model assessment and determination of significant 
variables in a given model. 

Finally, while the test results from our models may 
not have been as good as we would have liked for 
many of the species, our results indicate that accept- 
able predictive models covering large spatial extents 
can be developed, at least for some species. For the 
species in our study that were sufficiently restricted to 
specific habitat components (e.g., northern hardwood 
forests associated with high elevations in the moun- 
tains), we were able to develop models from data col- 


lected in Tennessee and then correctly predict the oc- 
currence of those species in Georgia and Virginia with 
a high degree of accuracy. Thus, some species will lend
themselves to this kind of modeling process much better
than others will, but our results show that successful
modeling of this type is possible over large extents.


Distributional Prediction Based on Ecological
Niche Modeling of Primary Occurrence Data 


A. Townsend Peterson, David R. B. Stockwell, and Daniel A. Kluza 


Knowledge of biodiversity is incomplete (Wilson
1988). The magnitude of numbers of species
(10⁶–10⁷), combined with relatively small numbers of
systematic biologists, suggests that scientific documentation
of world biodiversity (currently about 3x10?
specimens) is fragmentary (Anonymous 1994). Explo- 
rations of the completeness of biodiversity data on re- 
gional scales (e.g., Peterson et al. 1998) have found J- 
shaped frequency distributions—only a few species or 
sites are well documented, and most are poorly known 
and underdocumented. 

Many broadly based biodiversity programs, such as 
the U.S. National Gap Analysis Program and others 
(e.g., ICBP 1992; Scott et al. 1996), focus on 
geographic distributions of species. Predictions of geo- 
graphic distributions, however, depend on the incom- 
plete sampling available and therefore would be im- 
proved by some inference or modeling. Inferential 
procedures offer the possibility of using existing knowl- 
edge to predict into the gaps in knowledge. The nature 
of these inferential procedures, which carry their own 
assumptions and biases, affects the characteristics of 
any products of such programs. 

The purpose of this chapter is to explore some of 
the base constructs underlying modeling procedures 
(algorithms) used to infer species' distributions from 
incomplete data and to make recommendations re- 
garding these algorithms. Throughout this discussion, 


we distinguish between primary information (direct 
observation or documentation, often in the form of a 
specimen, of a particular species taken at a particular 
place and time) and secondary information (a synthe- 
sized product based on primary information, often in 
the form of a species biography, range map, or de- 
scription of habitat use). This distinction offers in- 
sights into the types of algorithms that are desirable 
for such analyses. 


Distributional Modeling 


The procedure used to convert primary distributional
data (i.e., known points of occurrence) into spatially
continuous information (i.e., predictions of presence
and absence across a landscape for a particular
species) is critical to the value of biodiversity
products. These procedures, however, are not all 
alike, and the differences among them have impor- 
tant implications. Understanding these differences 
may allow development of criteria for improved 
methodologies. 


One or Two Steps 


A first choice is whether the algorithm to be employed 
is of one or two steps: one-step approaches (e.g., Hol- 
lander et al. 1994) focus on modeling the geographic 
distribution directly from the spatial arrangement of 
known occurrence points. Two-step approaches (e.g.,
Austin et al. 1990) attempt to model the ecological
requirements of species (their ecological niches) and
project those requirements onto maps to produce a 
potential geographic distribution. 

One-step models are convenient because they focus 
directly on prediction of known geographic distribu- 
tions and are often computationally much less expen- 
sive. However, they usually require restrictive assump- 
tions regarding existing information, such as that 
sampling is sufficient to represent geographic limits, or 
that sampling is random or uniform such that ranges 
may be estimated from central tendencies (Hollander 
et al. 1994). These assumptions are rarely evaluated 
prior to application of such approaches (Csuti 1996) 
and are almost never realistic given the odd character- 
istics of existing biodiversity information—many 
species and sites are poorly sampled, and few species 
and sites are well sampled (see Peterson et al. 1998c 
for a quantitative analysis). Their weakness, to be 
more precise, lies in the fact that no special quality of 
a species is linked to its known geographic distribu- 
tion, and that species’ true geographic distributions re- 
sult from complex interactions of ecological and his- 
torical factors (Peterson et al. 1999). 

Two-step models, on the other hand, offer a direct 
tie to species’ biology. An ecological determinant of a 
species’ distribution, its ecological niche (MacArthur 
1972b), can be modeled to hypothesize environmental 
conditions under which it is likely to be able to main- 
tain populations. Then, geographic areas with those 
conditions can be identified as a potential distribution 
for the species. This procedure could potentially over- 
come some of the problems of sampling bias given that 
the ecological niche may be identifiable without a com- 
plete or balanced sampling of the species’ geographic 
distribution. In essence, the question of bias in sampling
passes from geographic space, where spatial biases
are well known (e.g., Peterson et al. 1998c), to
ecological space, where bias may be reduced. A niche 
model also allows additional inferences, such as 
changes in the species’ potential distribution under sce- 
narios of environmental change (Peterson et al. 2001). 


Error Components and Decision Criteria


In developing a model predicting a species’ geographic
distribution, two types of error are possible. Omission
is leaving out areas actually inhabited, whereas commission
is including areas not actually inhabited
(Scott et al. 1993; Krohn 1996). Although simple solu- 
tions can minimize either (predicting the whole map as 
present reduces omission error to zero; predicting only 
the known occurrence points reduces commission 
error to zero), an ideal algorithm would minimize 
both simultaneously. 

A typical, testable implementation of ecological or 
distributional modeling procedures involves subsetting 
available information to provide tests of model accu- 
racy. For instance, a model can be developed based on 
60 percent of the distributional points available 
(“training data”) and tested based on the remaining 
40 percent (“test data”), or a standard number of 
points can be chosen at random and set aside for 
model tests (Fielding and Bell 1997); similar test data 
sets can be developed for absence based on known ab- 
sences or on points chosen from the background 
(Stockwell and Peters 1999). In this case, omission 
error is directly estimable as the proportion of the test 
presence data set not predicted as “present.” Commis- 
sion errors are more problematic with presence-only 
data, as no direct estimator is available (Krohn 1996).
The overall area of the distribution predicted includes 
both actual distributional area (unknown but real, 
and generally larger than the known distribution) and 
the commission error component. Under this view, ge- 
ographic predictions of smaller areas will generally be 
better at reducing commission error. Hence, a reason- 
able decision rule is to reduce predicted areas as much 
as possible (reducing commission error) without in- 
creasing omission error. 
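
The testing scheme and the two trivial "solutions" mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the 10 × 10 landscape, the occurrence cells, and all function names are hypothetical, and the fraction of the map predicted present is used only as a rough stand-in for commission, since no direct estimator exists with presence-only data.

```python
import random


def split_points(points, train_fraction=0.6, seed=0):
    """Randomly divide occurrence points into training and test subsets."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def omission_error(test_presences, predicted_cells):
    """Proportion of test presence points not predicted as present."""
    missed = sum(1 for cell in test_presences if cell not in predicted_cells)
    return missed / len(test_presences)


def predicted_fraction(predicted_cells, all_cells):
    """Share of the landscape predicted present (smaller is better,
    provided omission error does not increase)."""
    return len(predicted_cells) / len(all_cells)


# Hypothetical landscape and ten known occurrence cells (row, column).
all_cells = [(r, c) for r in range(10) for c in range(10)]
occurrences = [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3),
               (3, 2), (3, 3), (3, 4), (4, 3), (4, 4)]

train, test = split_points(occurrences)

whole_map = set(all_cells)   # predicts presence everywhere
points_only = set(train)     # predicts only the known training points

for name, prediction in (("whole map", whole_map), ("points only", points_only)):
    print(name,
          omission_error(test, prediction),
          predicted_fraction(prediction, all_cells))
```

Predicting the whole map drives omission to zero at the cost of the largest possible predicted area, whereas predicting only the training points minimizes area but omits every test point; a useful algorithm must fall between these extremes.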

Given these arguments, the following ranked crite- 
ria are recommended for evaluating the potential of 
inferential approaches to modeling species’ geographic 
distributions: 


1. Minimize omission. Minimal omission of known 
occurrence points. 

2. Minimize commission. Find the smallest distribu- 
tional prediction that does not fail on Criterion 1. 

3. Emphasize data economy. With insensitivity to 
small sample sizes, elimination of potential biases 
in geographic or ecological coverage, and ability to 
use inexpensive data (i.e., no detailed physiological
or bibliographical data necessary), modeling can be
applied to distributions of large numbers of 
species, not just the small, well-studied minority. 

4. Model the niche. Although not a requirement for 
many simple applications, development of ecologi- 
cal niche models makes possible extension of bio- 
diversity analyses to more complex situations, such 
as scenarios of climate change, species invasions, 
human activity shifts, and the like. 

5. Check repeatability and emphasize renewability.
The ability to replicate a particular result via 
known, quantitative procedures lends scientific 
credibility to products. Especially useful would be 
approaches that are perpetually renewed and thus 
do not get old or out of date (e.g., published faunas 
or floras). 


Although relative importances of these criteria may 
vary according to their associated costs, the above cri- 
teria offer a general framework within which potential 
approaches may be viewed. 


The Primacy of Point Occurrence Data 


Throughout this discussion, the importance of basing 
biodiversity products (synthetic analyses such as gap 
analyses, conservation prioritizations, and richness 
maps) on primary point occurrence data is empha- 
sized. Secondary data are often in temptingly conven- 
ient form—published range maps, species accounts, or 
ecological summaries. However, secondary data usu- 
ally carry with them several problems. First, they in- 
sert elements of subjectivity—development of a range 
map usually involves an expert drawing a polygon or 
polygons around known areas of occurrence and 
guessing at which of the inevitable unsampled areas 
are likely to hold or not to hold populations. Second, 
availability of secondary data depends on publication 
of knowledge and for that reason often lags behind 
the existence of that knowledge. Data for particular 
taxonomic groups or regions may exist, but secondary 
data may be unavailable (compare numbers of publi- 
cations on birds with those on nematodes, or publica- 
tions about Massachusetts with those about Vietnam). 

Finally, and perhaps most critically, secondary data 
are usually one-time products (e.g., publication of a 


book), or at best products that are renewed or revised 
periodically, which makes the biodiversity products 
based on them degrade over time. For example, if 
range maps published in a regional field guide are used 
to develop a biodiversity product, those maps were 
probably somewhat out of date even when the guide 
finished the yearlong publication process—habitat 
modification may have extinguished populations or 
permitted invasion of new areas, or new distributional 
points may have been discovered. Publication of such 
a regional summary usually stimulates a rapid spate of 
additions and modifications to the state of knowledge 
as well, making the range maps degrade in quality and 
completeness even more rapidly. The biodiversity 
product degrades along with the underlying informa- 
tion, and soon is out of date. 

Primary data, on the other hand, offer solutions to 
most of these problems. These data, including occur- 
rences vouchered by scientific specimens as well as 
observational data, are abundant (Peterson et al. 
1998c). Although some primary data are decades or 
even centuries old, and precision of locality informa- 
tion is clearly reduced with older records, when 
modeling procedures are properly designed, model 
results may have excellent predictive ability
(Godown and Peterson 2000; Peterson et al. 2002;
Peterson 2001). Moreover, primary data, when aggregated
from world holdings, are renewable and improve
over time, as more and more information is
added to the storehouse. 

Recently, a multidisciplinary, inter-institutional ef- 
fort has laid a technological and political foundation 
for making primary biodiversity data from world sci- 
entific institutions available for such applications. The 
North American Biodiversity Information Network of 
the Commission for Environmental Cooperation 
(Montreal, Canada) and the National Science Founda- 
tion have funded the development of an Internet- 
based distributed database system called The Species 
Analyst (speciesanalyst.net). This facility uses the 
Z39.50 information transfer protocol to integrate institutional
holdings of primary biodiversity data into a
virtual “world database” that provides the information
on which various biodiversity products can be
based. The system presently serves about fifteen million
data points housed at twenty-two institutions and
provides a direct connection to facilities for analysis of
data via the tools described below. A combination of 
this distributed store of primary biodiversity data and 
rapid computational approaches is expected to pro- 
vide continually renewed and improved analytical 
abilities. 


Ecological Niche Modeling and 
Distributional Prediction 


The niche, as we use the term, is a set of tolerances 
and limits that define where a species is potentially 
able to maintain populations (Grinnell 1917), later 
visualized as an N-dimensional polyhedron in eco- 
logical space (Hutchinson 1957). Given the spatial 
scale of our analyses, we focus on niche dimensions 
relevant to geographic distributions rather than to 
local distributional issues such as microhabitat or 
substrate selection. Hence, the niche dimensions em- 
ployed are those usually considered in geographic 
limitation of species—temperature, precipitation, ele- 
vation, vegetation, and so forth, and not the finer- 
resolution information that may be necessary to re- 
solve more detailed issues, such as local substrate 
selection. 

Ecological niches are generally divided into funda- 
mental and realized ecological niches: the former 
represents the base ecological capacity of the species, 
and the latter incorporates the effects of interactions 
with other species (MacArthur 1972b). Because com- 
munity composition varies greatly over space, its im- 
pacts should vary as well (Dunson and Travis 1991). 
Models at the level of entire species’ distributions 
may allow identification of broader ranges of envi- 
ronments potentially suitable for a species, analo- 
gous to a fundamental niche, although on coarse 
spatial scales. In some senses, most of the approaches 
used in distributional modeling approximate a niche 
as they are defined in multidimensional ecological 
space; however, the one-step approaches discussed 
above intermingle modeling of the fundamental and 
realized niches and therefore stray from a simple list 
of environmental factors important to a species' 
presence or absence. 

Several two-step approaches have been used to ap- 
proximate species’ ecological niches; three are treated 


herein. The simplest is BIOCLIM (Nix 1986), which 
involves tallying species’ occurrences in categories 
along each environmental dimension, trimming mar- 
ginal portions of distributions, and taking the niche as 
the conjunction of the trimmed ranges (e.g., annual 
mean temperature between 4°C and 6°C, annual mean 
precipitation between 1,000 and 1,200 millimeters, 
etc.). Although easy to implement and conceptually 
attractive, BIOCLIM suffers reduction in efficacy 
when many environmental dimensions are included 
(D. R. B. Stockwell unpublished data; B. Loiselle personal
communication).
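
The envelope idea can be made concrete with a short Python sketch. It is a simplified stand-in rather than the BIOCLIM implementation: it trims a fixed number of records from each tail of the raw values instead of tallying occurrences into categories, and the variable names, trimming rule, and occurrence values are all assumptions made for illustration.

```python
def trimmed_range(values, n_trim=1):
    """Range of values after dropping n_trim records from each tail."""
    ordered = sorted(values)
    kept = ordered[n_trim:len(ordered) - n_trim]
    return kept[0], kept[-1]


def fit_envelope(occurrences, n_trim=1):
    """One trimmed range per environmental dimension."""
    variables = occurrences[0].keys()
    return {v: trimmed_range([o[v] for o in occurrences], n_trim)
            for v in variables}


def in_envelope(site, envelope):
    """A site is inside the niche only if every dimension is within range."""
    return all(low <= site[v] <= high for v, (low, high) in envelope.items())


# Hypothetical occurrence records: annual mean temperature (deg C)
# and annual mean precipitation (mm).
temps = [3.0, 4.1, 4.5, 4.8, 5.0, 5.2, 5.5, 5.9, 6.0, 9.0]
precips = [900, 1000, 1020, 1050, 1100, 1120, 1150, 1180, 1200, 1500]
occurrences = [{"temp": t, "precip": p} for t, p in zip(temps, precips)]

envelope = fit_envelope(occurrences)
print(envelope)
print(in_envelope({"temp": 5.0, "precip": 1100}, envelope))
print(in_envelope({"temp": 5.0, "precip": 1400}, envelope))
```

Because the niche is the conjunction of all ranges, each added dimension can only shrink the set of sites that qualify.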

A second class of approaches is based on logistic 
multiple regression, a class of statistical techniques 
aimed at predicting the probability of “yes” versus
“no” in the dependent variable (e.g., Mladenoff et
al. 1995). Although complex in its application to bio- 
diversity data for which absence data are not com- 
mon, thus requiring sampling of the background land- 
scape as a substitute, this idea combines well with the 
concept of physiological tolerances determining 
species’ presences along continuous climate axes. It, 
however, is less well suited to incorporation of categorical
information (e.g., vegetation type, soil type). In
effect, logistic regression divides environmental space
into two portions (“habitable” and “uninhabitable”) 
at a particular probability threshold, an approach that 
may be useful under some circumstances. More-recent 
implementations of this approach have included im- 
provements such as relaxation of assumptions regard- 
ing distributions of errors in the regression (e.g., 
Austin et al. 1990). 
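
How a probability threshold partitions environmental space can be shown with a one-variable sketch. The coefficients below are hypothetical and were not fitted to any data; in practice they would be estimated from presence records and background or absence samples.

```python
import math

# Hypothetical coefficients for a single climate axis (illustration only).
INTERCEPT = -5.0
SLOPE = 1.0  # change in log-odds per degree C of annual mean temperature


def probability_of_occurrence(temperature_c):
    """Logistic response along one environmental axis."""
    z = INTERCEPT + SLOPE * temperature_c
    return 1.0 / (1.0 + math.exp(-z))


def habitable(temperature_c, threshold=0.5):
    """Classify a site by comparing its probability with the threshold."""
    return probability_of_occurrence(temperature_c) >= threshold


for temperature in (3.0, 5.0, 7.0):
    print(temperature,
          round(probability_of_occurrence(temperature), 3),
          habitable(temperature))
```

With a threshold of 0.5, the boundary between "habitable" and "uninhabitable" falls where the linear predictor equals zero (here, 5 °C); changing the threshold shifts that boundary along the axis.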

Finally, the Genetic Algorithm for Rule-set Predic- 
tion (GARP) includes simplified versions of both of 
the methods described above, as well as other set- 
based approaches in an iterative, artificial-intelligence 
approach (Stockwell and Noble 1992; Stockwell and 
Peters 1999; http://biodi.sdsc.edu/). Individual algo- 
rithms (e.g., BIOCLIM, logistic regression) are used to 
produce component “rules” in a broader rule-set, and 
hence portions of the species’ distribution may be de- 
termined as inside or outside of the niche based on dif- 
ferent rules from several algorithms. As such, GARP is 
a superset of the other approaches and should always 
perform as well as or better than any one of them. Testing
has indicated good predictive ability and relative insensitivity to small sample
sizes—even down to 10-20 sample points (Peterson et 
al. 2002; Peterson 2001; Stockwell and Peterson in 
press)—and insensitivity to BIOCLIM's problems with 
environmental data density (Peterson and Cohoon 
1999), although precision of predictions clearly de- 
pends on the minimum resolution available in both 
environmental and species’ occurrence data (Peterson 
and Kluza unpublished data). 

GARP models generally predict geographic distri- 
butions of species accurately (Peterson et al. 1999; 
Peterson and Cohoon 1999; Peterson et al. 2002; Pe- 
terson 2001). Within the region in which a particular 
species occurs, it does an excellent job of including 
and excluding areas that are inhabited and not. Two 
phenomena cause deviations from accurate predictions.
The first involves insufficient data dimensions
(Peterson and Cohoon 1999). The effect of including too few
environmental variables in a modeling effort is that factors
critical to limiting species' distributions in space
may be omitted, and for that reason predicted geographic
distributions are too large. Tests of these
methodological challenges indicate that most models 
stabilize with four to five environmental dimensions 
(Peterson and Cohoon 1999); however, ecological
models will clearly be more predictive as more
environmental dimensions are included in
the analysis.


The second type of error in GARP involves prediction
of occurrence in biogeographic regions not inhabited
by the species in question. Although initially
worrisome, this phenomenon makes considerable
sense: species are limited to particular geographic
areas not solely by ecological factors (modeled by
GARP), but also by historical phenomena. These historical
factors include speciation events (producing
sister species in the “other” area), extinction events,
and limited colonization ability (why are humming- 
birds not in Africa?), as well as possible historical 
species’ interactions (R. Anderson et al. submitted). 
In fact, a recent study demonstrated that ecological 
niches of many species accurately predict geographic 
distributions of their allopatric sister species because 
ecological niches apparently evolve more slowly than 
speciation events occur (Peterson et al. 1999). The 


high degree of predictability for species invasions 
achieved by applying GARP to native distributions 
and projecting to invaded ranges (Peterson and 
Vieglais 2001) is further indication that species com- 
monly do not inhabit the entire geographic extent of 
their niches. This category of error is eliminated by 
restriction to biogeographic regions known to be in- 
habited by the particular species under study (requir- 
ing assumptions about the thoroughness of sampling 
at the level of biogeographic regions), but may also 
suggest possible undiscovered isolated populations 
that should be investigated. 


A Test 


GARP is a quantitative method for modeling ecolog- 
ical niches and predicting geographic distributions 
from primary point occurrence data. This approach 
makes point-occurrence data relevant to biodiversity 
applications that require information on geographic 
distributions of species. The GARP approach is an 
alternative to the approach used (for example) in 
most Gap applications (Scott et al. 1996), so we de- 
veloped a direct comparison of the two methods; we 
report initial results herein and will provide a com- 
plete summary elsewhere (Peterson and Kluza sub- 
mitted). 

In consultation with Gap programs around the 
country, we selected the Maine Gap Analysis Program 
as an ideal testbed, being recently completed and re- 
plete with both fine-scale bird distributional data and 
relevant environmental coverages. We used stop-level 
U.S. Breeding Bird Survey (BBS) data
(http://www.mp2-pwrc.usgs.gov/bbs/index.htm) for thirty forest
bird species from 1990 (kindly provided by R. O’Con- 
nor and W. Krohn), combined with environmental 
coverages including elevation; slope; aspect; annual 
mean precipitation; vegetation type; and average max- 
imum, absolute maximum, average minimum, and 
absolute minimum temperatures for winter, spring, 
summer, fall, and the entire year. Twenty known occurrence
points for each species were set aside for testing
models’ predictions of presence; the remaining
known occurrence points were used to build models. 
All BBS points from which particular species were 
not known were used to test predictions of absence,
although these points clearly included some undetected
presences as well.

As an example of test results, we discuss Gap and 
GARP predictions for the wood thrush (Hylocichla 
mustelina) (see Fig. 55.1 in color section). Of the 
twenty test presence points, both Gap and GARP 
models correctly predicted eleven; of the 1,152 test 
absence points, however, the GARP model success- 
fully predicted 806, whereas the Gap model success- 
fully predicted only 471. These results indicate that 
the two approaches are generally comparable on 
false-negative (omission) error (Fielding and Bell 
1997), but GARP performed better on avoiding false- 
positive (commission) error (Fielding and Bell 1997), 
at least given the test absence data that were available 
to us. Direct statistical comparison of the perform- 
ance of the two approaches (Fielding and Bell 1997) 
indicated that the GARP model was highly statisti- 
cally significantly more accurate (McNemar’s test, 
χ² = 215, P < 10⁻⁴³). Indeed, of the thirty species
tested, twenty-eight were significantly better predicted 
by GARP than Gap, in all cases with greatly reduced 
false-positive (commission) error rates (Peterson and 
Kluza submitted). 
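
The wood thrush counts reported above translate into error rates as in the sketch below. The counts of correct predictions come from the text; the function names are hypothetical. The McNemar statistic is included only to show its form: it depends on the numbers of test points classified correctly by one model but not the other, which are not reported here, so the values passed to it are invented, and the uncorrected formula shown may differ from the variant the authors used.

```python
TEST_PRESENCES = 20
TEST_ABSENCES = 1152

# Correctly predicted test points, as reported for the wood thrush.
correct_presences = {"Gap": 11, "GARP": 11}
correct_absences = {"Gap": 471, "GARP": 806}


def omission_rate(model):
    """Proportion of test presence points predicted absent."""
    return (TEST_PRESENCES - correct_presences[model]) / TEST_PRESENCES


def commission_rate(model):
    """Proportion of test absence points predicted present."""
    return (TEST_ABSENCES - correct_absences[model]) / TEST_ABSENCES


def mcnemar_chi2(only_a_correct, only_b_correct):
    """Uncorrected McNemar statistic from the two discordant counts."""
    difference = only_a_correct - only_b_correct
    return difference ** 2 / (only_a_correct + only_b_correct)


for model in ("Gap", "GARP"):
    print(model, omission_rate(model), round(commission_rate(model), 3))

# Hypothetical discordant counts, for illustration of the formula only.
print(mcnemar_chi2(30, 10))
```

Both models omit 45 percent of the test presences, whereas the apparent commission rate is roughly 59 percent for Gap and 30 percent for GARP, subject to the caveat that some test absences are undetected presences.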

The Maine study is a direct test of predictive effi- 
ciency using GARP and Gap methodologies. Our in- 
terpretation of its results is that Gap models identified 
species’ habitat associations reasonably well but failed 
to predict regional restriction within the state (hence 
the high false-positive error rates). GARP models, on 
the other hand, performed similarly or somewhat 
worse at identifying habitat associations but were bet- 
ter able to restrict regional distributions. Referring to 
the criteria outlined above, while GARP and Gap 
models both avoided omission (criterion 1) about as 
well (or Gap models performed slightly better), GARP 
models avoided commission considerably better than 
the Gap models (criterion 2). (It is worthy of note that 
our measures of commission error are subject to the 
criticism that our test absence data set is certainly a 
mixture of real absences and apparent absences.) 
GARP models function based on the minimum-quality 
data available (point-occurrence data) and do not re- 
quire further information (criterion 3), such as litera- 
ture on habitat preferences. GARP models produce a 
niche model (criterion 4) and are readily repeatable (at 


least in a statistical sense, as they are able to be rerun 
and reproduced at will) and renewable (criterion 5). 
Hence, we suggest further exploration of the possibili- 
ties of GARP for improving the predictivity of distrib- 
utional models used in gap analyses. 


Possibilities and Conclusions 


Development of accurate, quantitative, and repeatable 
methods for inferring ecological niches and geo- 
graphic distributions from primary point-occurrence 
data opens doors to new advances in studying biodiversity.
A series of products becomes possible, including
the following:


1. Single species distributional predictions. Distribu- 
tional predictions for single species can be useful, 
for example, in localizing rare and poorly known 
species, designing reintroduction programs, and 
protecting endangered species. 

2. Community predictions. By overlaying results for numerous
species, estimation of assemblages of
species at particular points in space becomes feasible,
providing a “rapid inventory” approach to understanding
local species distribution patterns.

3. Conservation prioritization and impact assess- 
ment. Once local assemblages are understood, 
combinations of species present and absent at 
sites can be analyzed; if interpreted as conjunc- 
tions of species of concern, then concentrations 
represent areas for conservation, whereas areas in 
which such species are absent can be interpreted 
as areas of reduced concern (Godown and Peter- 
son 2000). 

4. Climate change/environmental change. Development
of niche models opens the possibility of
projecting an ecological niche onto other land- 
scape scenarios besides the present set of condi- 
tions; these approaches can be illuminating for 
applications such as understanding the effects of 
global climate change (Peterson et al. 2001), pre- 
dicting species invasions (Peterson and Vieglais 
2001), or projecting effects of human population 
trends.


Exploration of these possibilities should bring 
about a phase of rapid development of improved 
products and qualitatively new products. Gap analysis,
for example, could be moved from a one-time
product that must be laboriously updated from time 
to time to a continuously updated, never-out-of-date, 
growing database that builds products in real time, 
thus taking advantage of a maximum of information 
for every result. 
Statistical Mapping of Count Survey Data


J. Andrew Royle, William A. Link, and John R. Sauer


The North American Breeding Bird Survey (BBS;
Robbins et al. 1986) is conducted during the
breeding season of each year by volunteer observers. 
The sampling unit in the BBS is a roadside route 24.5 
miles (39.4 kilometers) in length, containing fifty
stops. At each stop along a route, birds are counted by 
sight and sound for a period of three minutes. Over 
four thousand routes have been surveyed in North 
America. The BBS is incapable of providing measures 
of absolute abundance because the proportion of the 
population counted (the count proportion) is un- 
known and the effective sampling area and relevant 
local populations are poorly defined. Thus, the BBS 
produces counts along transects that may be loosely 
interpreted as indices to local abundance, or relative 
abundance (Link and Sauer 1998). Although data 
from the BBS are most often used for monitoring and 
assessment of temporal trends in bird abundance, the 
data are also useful for providing information on the 
spatial distribution of relative abundance. Such infor- 
mation can be used, for example, to relate abundance 
to habitat attributes, to examine changes in the spatial 
distribution of abundance, and to assess the relative 
abundance in sparsely sampled regions. 

In this chapter, we consider the problem of spatial 
prediction of relative abundance from BBS count data. 
Nominally, our goal is mapping; that is, predicting at 
many points within a region for the purpose of pro- 


ducing a relative abundance map for a particular 
species. Mapping may be approached in a number of 
ways using various more or less ad hoc procedures 
such as inverse-distance squared interpolation, spline 
or kernel-density smoothing, and kriging (e.g., Sauer 
et al. 1995). However, we believe that mapping should 
be based on a biologically reasonable statistical 
model. Thus, we approach the mapping problem 
within a model-based framework, assuming that BBS 
route counts are observations of a spatially indexed 
Poisson random variable. 

Because count data are discrete, non-negative,
and often exhibit strong mean-variance relationships, 
Poisson models are a natural starting point for model- 
ing such data. Our objective here is to demonstrate the 
application of a mapping procedure based on a Pois- 
son model with spatially correlated mean. We adopt 
the Poisson modeling approach proposed by Diggle et 
al. (1998), which they used for mapping radiation ra- 
dionuclide concentrations. The Poisson mean may de- 
pend on fixed covariates, as in standard Poisson re- 
gression (e.g., Jones, Chapter 35), which is common 
throughout statistics. However, because the model al- 
lows for the Poisson mean to be spatially correlated, it 
departs from the standard Poisson regression frame- 
work. It is perhaps best viewed as a Poisson-analog of 
the autologistic model for modeling a spatially 
correlated binary variable (e.g., Klute, Chapter 27). 
More generally, this model is itself a special case of the 
generalized linear mixed models (GLMM) that have
been widely adopted in all fields of statistical practice. 

There are several important reasons for pursuing a 
model-based strategy for mapping BBS data, and 
count data in general. First, in order to conduct for- 
mal statistical inference, a statistical model is required. 
Map-based inference problems that might be of inter- 
est include comparison of maps over time (such as to 
assess the impact of climate events or landscape 
changes); comparison of areal averages over arbitrary 
regions, thus avoiding the problem of subjective and 
ad hoc stratification; model-based design under which 
routes are added and deleted, or moved in a manner 
that is optimal with respect to a variance-based crite- 
rion; and inference about particular predicted values. 
Of course, such maps may still serve as a purely de- 
scriptive assessment of the spatial distribution of a 
bird species. Second, because count data are non- 
negative and often exhibit strong mean-variance rela- 
tionships, a proper predictor should accommodate 
these features. Putative model-free procedures do not 
address either of these concerns and can produce un- 
reasonable negative predictions. Third, model-based 
procedures produce a natural measure of prediction 
uncertainty (i.e., the prediction variance), which can 
be used to qualify prediction maps. Finally, proper 
modeling of count data allows a much more efficient 
use of data than in analyses based on reductions of 
counts to presence/absence (e.g., “range maps”), as 
pointed out by Jones et al. (2001). 

In this chapter, we propose a framework for map- 
ping count data from the BBS, based on a Poisson 
model with spatial correlation. In the next section, we 
define the general mapping problem with some discus- 
sion of the common technique known as kriging. In 
the Breeding Bird Survey Data section, we discuss 
mapping issues relevant to the BBS data. The Poisson 
mixed model is introduced in the section that follows. 
Because we adopt a Bayesian approach to analysis of 
the model, some issues pertaining to this are then dis- 
cussed. In this section, we also introduce a general 
method of fitting complex Bayesian models known as 
Markov chain Monte Carlo (MCMC), although details
of the algorithm that we used are deferred to the
Appendix. Finally, results from applying the model to
mapping the relative abundance of mourning doves
(Zenaida macroura) are given, and the chapter concludes
with discussion of issues pertaining to extension
of the model.


Mapping 


Mapping is inherently a problem of spatial prediction. 
That is, given a set of observations $y = (y_1, y_2, \ldots, y_n)$,
we wish to predict the value of a new observation,
say $y_0$. Predictions on a grid of new locations produce
a map. There are many techniques commonly used for
mapping data. These include variations on “inverse-distance”
interpolation, kernel smoothing, thin-plate
splines, and kriging. Most of these are essentially 
“black-box” procedures that do not rely on specifica- 
tion of a statistical model for the underlying process. 
As such, quantification of the prediction uncertainty is 
generally difficult for most of these procedures. We 
briefly discuss kriging here because it is one of the 
more common techniques employed for mapping. 
Kriging can be viewed in a model-based context, but it 
is most often viewed as a “model-free” procedure.

Let $s = (s_1, s_2, \ldots, s_n)$ denote a set of sampling locations
at which observations $y(s_1), y(s_2), \ldots, y(s_n)$ of
some spatial variable are made. Kriging assumes that
the data are a realization of a random spatial process,
$\{Y(s) : s \in D\}$, where $D$ is a fixed subset of $\mathbb{R}^2$ (i.e., $D$
is some spatial region). Predictions are produced 
based on first and second moments (i.e., mean and co- 
variance) of the underlying random process. For clar- 
ity, we will present what is known as "simple kriging" 
here (Cressie 1991, 110), which assumes that the 
mean of the process is known. Many generalizations 
of kriging are possible, and the interested reader is re- 
ferred to Cressie (1991) for details. The traditional de- 
velopment of kriging specifies the mean and covari- 
ance structure as 


$$E[Y(s)] = \mu, \qquad \mathrm{Var}[Y(s)] = \sigma^2,$$
$$\mathrm{Corr}(Y(s), Y(s')) = k_\theta(\lVert s - s' \rVert).$$


Here, $k_\theta(\lVert s - s' \rVert)$ is the correlation function, which
depends on parameter $\theta$ (possibly a vector) and the
distance between locations $s$ and $s'$. Thus, it is assumed
that the correlation function is stationary and
isotropic; in other words, it depends only on the distance
between points and not on their absolute locations.
A common correlation model is the single-parameter
exponential function given by


$$k_\theta(\lVert s - s' \rVert) = e^{-\lVert s - s' \rVert / \theta}.$$


Under this model, the correlation between two 
points decays exponentially as the distance between 
those points increases. The rate of decay is controlled 
by $\theta$, which is loosely interpreted as a correlation
range parameter (i.e., larger values of $\theta$ lead to higher
correlation between points located further apart). Although
there are many other parametric models for
the correlation function, including multiparameter 
models, there is often little theoretical justification for 
use of one over the other. Thus, given the concise de- 
scription of correlation provided by the exponential 
model, we feel this simple model to be a reasonable 
choice for many applications involving mapping. Oc- 
casionally, it makes sense to parameterize a nugget ef- 
fect in the covariance function, which amounts to a 
discontinuity at the origin of the covariance function 
and arises due to measurement error and uncorrelated 
“small-scale” variation in the process.
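
The effect of the range parameter can be seen by building the correlation matrix for a handful of sites. The following is a minimal sketch, assuming the exponential form $e^{-d/\theta}$ with Euclidean distance $d$; the function names and the three site coordinates are hypothetical.

```python
import math


def exp_corr(s, s_prime, theta):
    """Exponential correlation exp(-d / theta) between two planar points."""
    return math.exp(-math.dist(s, s_prime) / theta)


def corr_matrix(sites, theta):
    """Correlation matrix for a list of (x, y) sites."""
    return [[exp_corr(a, b, theta) for b in sites] for a in sites]


# Three illustrative sites (arbitrary units).
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
K_short = corr_matrix(sites, theta=0.5)  # correlation dies off quickly
K_long = corr_matrix(sites, theta=5.0)   # correlation persists over distance

for row in K_long:
    print([round(v, 3) for v in row])
```

With the larger value of $\theta$, sites one unit apart remain strongly correlated, whereas with the smaller value they are nearly independent.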

Kriging seeks to predict the process $Y$ at an unmonitored
location $s_0$, say $y(s_0)$, using a linear function of
the observed data. Thus, the kriging predictor is of the
form

$$\hat{y}(s_0) = \sum_{i=1}^{n} \lambda_i \, y(s_i),$$

where the $\lambda_i$ are the kriging weights. The kriging
weights are selected so as to minimize the mean-squared
prediction error

$$E\left[\left(y(s_0) - \hat{y}(s_0)\right)^2\right] \qquad (56.1)$$


subject to the predictor being unbiased. Under the assumption
of constant mean, the requirement of unbiasedness
is equivalent to $\sum_i \lambda_i = 1$. The kriging weights
are computed as the solution to minimizing Equation
56.1 subject to the unbiasedness constraint. This is a simple
optimization problem solved by taking derivatives,
setting them to zero, and solving the resulting linear
system of equations. See Cressie (1991, 120) for
details.
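
The constrained minimization can be sketched numerically. The example below is an illustration only, assuming the exponential correlation function, no nugget effect, and a Lagrange multiplier for the constraint that the weights sum to one; all function names and the four-site data set are hypothetical, and a small Gaussian elimination routine stands in for a linear algebra library.

```python
import math


def exp_corr(d, theta):
    return math.exp(-d / theta)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def kriging_weights(sites, s0, theta):
    """Weights minimizing mean-squared prediction error subject to sum = 1."""
    n = len(sites)
    # Augmented system: [K 1; 1' 0] [weights; multiplier] = [k0; 1]
    A = [[exp_corr(math.dist(a, b), theta) for b in sites] + [1.0] for a in sites]
    A.append([1.0] * n + [0.0])
    rhs = [exp_corr(math.dist(a, s0), theta) for a in sites] + [1.0]
    return solve(A, rhs)[:n]


def krige(sites, y, s0, theta):
    w = kriging_weights(sites, s0, theta)
    return sum(wi * yi for wi, yi in zip(w, y))


# Four illustrative route midpoints on a unit square, with made-up counts.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
y = [12.0, 30.0, 18.0, 25.0]
weights = kriging_weights(sites, (0.5, 0.5), theta=2.0)
prediction = krige(sites, y, (0.5, 0.5), theta=2.0)
print(weights, prediction)
```

By symmetry the prediction at the center of the square is the simple average of the four counts, and predicting at an observed site returns the observed value, as expected when no nugget effect is included.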

In practice, parameters of the kriging model must 
be estimated. This is often done in an ad hoc fashion. 
Typically, mean parameters are estimated by generalized
least-squares, the correlation parameter is estimated
by fitting the parametric model to “empirical”
estimates computed using residuals from this fit, and 
the procedure is iterated. Likelihood estimation is ap- 
parently neglected due to the absence of a likelihood. 

Kriging leads to predictions that are optimal in the 
sense that they have minimum variance among all lin- 
ear, unbiased predictors. That is, the kriging predictor 
is the Best Linear Unbiased Predictor, or BLUP, regard- 
less of distributional assumptions beyond existence of 
the first two moments (hence the “model-free” interpretation).
Although this is often seen as a benefit of
kriging, we believe it to be a major deficiency, particu- 
larly when the random variable is clearly non-normal, 
such as with count data. In general, for non-normal 
data, there is little theory to suggest that the kriging 
predictor is reasonable. Indeed, for non-normal data, 
distributions are not completely specified by the first 
and second moments and therefore use of a kriging 
predictor is somewhat ambiguous because it is based 
on specification of first and second moments alone. 
Moreover, estimates of prediction variance computed 
under kriging procedures tend to underestimate true 
uncertainty in the prediction since error in estimation 
of covariance parameters is not accounted for. Recent 
developments (e.g., Handcock and Stein 1993; Diggle 
et al. 1998) toward Bayesian formulations of kriging 
alleviate this problem and are more naturally applied 
to non-normal data problems. 

Under distributional assumptions on the data, a 
predictor that is optimal in a stronger sense can be 
computed. The Best Unbiased Predictor (BUP) is de- 
fined to be the conditional expectation of the quantity 
to be predicted given the data, or in other words 
$E[y(s_0) \mid y]$ (in general, this conditional expectation follows
directly from specification of the joint distribution
of $y(s_0)$ and $y$). For the special case when the spatial
process has a normal distribution, it can be shown
that the BUP is linear and therefore equivalent to the 
BLUP. Thus, the kriging predictor is equivalent to the 
BUP derived under a normality assumption; use of 
kriging can therefore be viewed as an implicit assumption
of normality.
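
To make the equivalence concrete, the standard conditional-mean result for jointly normal variables can be written in the notation of this section. Here $k_0$ (the vector of correlations between $y(s_0)$ and the observations) and $K_\theta$ (the $n \times n$ correlation matrix of the observations) are symbols introduced for illustration, and the mean $\mu$ and variance $\sigma^2$ are assumed known:

```latex
\begin{aligned}
E[y(s_0) \mid y] &= \mu + k_0^{\top} K_\theta^{-1} (y - \mu \mathbf{1}), \\
\mathrm{Var}[y(s_0) \mid y] &= \sigma^2 \left(1 - k_0^{\top} K_\theta^{-1} k_0\right),
\qquad k_0 = \bigl(k_\theta(\lVert s_0 - s_i \rVert)\bigr)_{i=1}^{n}.
\end{aligned}
```

The conditional mean is linear in the data, which is why the best unbiased predictor coincides with the linear kriging predictor in the normal case.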

Count data are clearly non-normal. Because count 
data cannot be negative, one should use a predictor 
that guarantees positivity of the predictions. Addition- 
ally, count data typically exhibit strong mean-variance 
relationships that are ignored by kriging procedures. 
Although one can mitigate some of these problems by 
use of a transformation (e.g., log-normal kriging), 
there is little theoretical basis for this in the context of 
modeling counts. Therefore, we believe there is a need 
for a model-based mapping procedure for count data 
that produces accurate predictions of relative abun- 
dance and associated measures of prediction uncer- 
tainty while at the same time imposing realistic distri- 
butional assumptions on the data. 


Breeding Bird Survey Data 


The data used for illustration here consist of mourning
dove counts made on eighty-seven routes in Pennsylvania
during 1990 (Fig. 56.3, top panel). The grid
points on this map are those locations at which predictions
are desired in order to construct the map. We
arbitrarily chose a 30 × 40 grid of points for prediction.
For convenience, we associate the total count for 
each BBS route with the spatial location of the route 
midpoint. 

A major consideration in the analysis of BBS data is 
observer biases (Sauer et al. 1994). In the analysis of 
trends, observer effects are typically parameterized as 
nuisance effects and not modeled directly (e.g., Link 
and Sauer, 1997a). For mapping BBS data from a sin- 
gle year, a simple treatment of observer effects is pos- 
sible. Generally, each route count is made by a differ- 
ent observer, and it is reasonable to assume that the 
observer effects are independent across space. Our solution,
then, is to parameterize the observer variation
as measurement error, which is discussed in the fol- 
lowing section. 

We expect BBS counts made at different routes to 
be spatially correlated for (at least) two reasons. First, 
correlation arises due to similarities (or differences) in 
underlying habitat structure. That is, all other things 
being equal, counts made on routes sharing similar 
habitat characteristics should be more similar than 


counts made on routes with dissimilar habitat charac- 
teristics. Second, even conditional on habitat struc- 
ture, there are broad-scale differences in abundance 
of a species within that species’ range that are likely 
to vary smoothly over space. The model that we pro- 
pose in the following section accommodates spatial 
correlation among counts in a very general fashion 
and can be modified slightly to model both sources 
individually. 


A Poisson Mixed Model for 
Mapping Bird Counts 


Let y(s) be the count made at some route centered at 
location $s$. We assume that $y(s)$ has a Poisson distribution
with mean $\lambda(s)$, which is denoted as


$$y(s) \mid \lambda(s) \sim \mathrm{Poisson}(\lambda(s))$$


The “$\mid \lambda(s)$” notation merely makes explicit that
this model for $y(s)$ is conditional on $\lambda(s)$. Though often
omitted, this notation is more rigorous for our
subsequent treatment of $\lambda(s)$ as itself being a random
variable, as we discuss shortly. Thus, two observations,
say $y(s)$ and $y(s')$, are independent conditional
on their means $\lambda(s)$ and $\lambda(s')$; that is, any association
in the counts at two sites is in their expected values.
The Poisson variation in counts—that due to chance 
events in the observation process—is assumed to be 
independent from site to site. As is natural for Poisson 
data, we then specify a model for the logarithm of the 
mean of this Poisson variable as 


$$\log(\lambda(s)) = \mu + Z(s) + \eta(s)$$


Here, $\mu$ is a constant and $Z(s)$ and $\eta(s)$ are random
effects, the meanings of which will be described
shortly. Generally, $\mu$ could be replaced with an arbitrary
regression function such as $\mu(s) = \sum_j \beta_j x_j(s)$,
where the $x_j(s)$ are spatially indexed covariates, such as
habitat, and the $\beta_j$ determine the change in abundance (on
the log scale) per unit change in covariate $j$. This
model is more generally useful, but the added generality
is not necessary in our development. To reduce
clutter in formulae given below, we will define $\nu(s) =
\log(\lambda(s))$.

The random effects in this model consist of a
smooth, spatially correlated “map,” $Z(s)$, and an uncorrelated
“measurement error” term, due to observer
differences, $\eta(s)$. More specifically, we assume that
$Z(s) \sim N(0, \sigma_z^2)$ with $\mathrm{Corr}(Z(s), Z(s')) = k_\theta(\lVert s - s' \rVert)$,
and $\eta(s) \sim N(0, \sigma_\eta^2)$ with $\mathrm{Corr}(\eta(s), \eta(s')) = 0$ for $s \neq s'$.
Thus, the correlation between the route effects at any
two sites, say $s$ and $s'$, is given by the correlation function
$k_\theta(\lVert s - s' \rVert)$. We used the exponential function
discussed in the section on mapping, so $k_\theta(\lVert s - s' \rVert) =
e^{-\lVert s - s' \rVert / \theta}$. Other correlation functions could be considered
(see Cressie 1991 for other choices). The fact
that the route effects are correlated allows for prediction
of $Z(s)$ (and $\lambda(s)$) at arbitrary, unsampled locations.
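
The hierarchical structure just described can be illustrated by simulating counts from it. The following is a minimal sketch using only the Python standard library; the parameter values, the 4 × 4 grid of sites, and all function names are hypothetical, and the simple Poisson sampler is adequate only for moderate means.

```python
import math
import random


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def poisson_draw(lam, rng):
    """Knuth's multiplication method for a Poisson variate."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_counts(sites, mu, theta, sigma_z, sigma_eta, seed=0):
    rng = random.Random(seed)
    n = len(sites)
    # Spatially correlated effect Z ~ MVN(0, sigma_z^2 * K_theta)
    K = [[math.exp(-math.dist(a, b) / theta) for b in sites] for a in sites]
    L = cholesky(K)
    e = [rng.gauss(0.0, 1.0) for _ in range(n)]
    Z = [sigma_z * sum(L[i][k] * e[k] for k in range(i + 1)) for i in range(n)]
    # Independent observer effect eta ~ N(0, sigma_eta^2)
    eta = [rng.gauss(0.0, sigma_eta) for _ in range(n)]
    # log(lambda) = mu + Z + eta, then conditionally independent Poisson counts
    lam = [math.exp(mu + z + h) for z, h in zip(Z, eta)]
    y = [poisson_draw(l, rng) for l in lam]
    return Z, lam, y


sites = [(float(i), float(j)) for i in range(4) for j in range(4)]
Z, lam, y = simulate_counts(sites, mu=3.0, theta=2.0, sigma_z=0.5, sigma_eta=0.2)
print(y)
```

Neighboring sites share similar values of $Z(s)$ and therefore similar expected counts, while the observer term and the Poisson sampling add site-specific noise.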

Variation that is not explicitly accounted for in the
mean term is absorbed into the two random effects,
the former being a surrogate for spatial sources of variation
that have been neglected (such as omitted habitat
covariates) and that may be of interest (for example, in
prediction), the latter representing unpredictable (e.g.,
observer, protocol) variation. In other words, the
set of $\eta(s)$ parameters are merely nuisance parameters
not of direct interest for any inferential problem. The
interpretation of $Z(s)$ depends on whether $\mu$ contains
habitat covariates. The variance components $\sigma_z^2$ and
$\sigma_\eta^2$ are the variances attributable to the smooth spatial
process and the observer differences, respectively.
Thus, when $\mu$ contains habitat covariates, we would
expect $\sigma_z^2$ to be smaller than when habitat information
was omitted from $\mu$.

Since the $\eta(s)$ terms are assumed to be independent
nuisance parameters, it is clear that, conditional on $\mu$
and $Z(s)$, $\log(\lambda(s))$ has a normal distribution:


$$\log(\lambda(s)) \mid \mu, Z(s) \sim N(\mu + Z(s), \sigma_\eta^2)$$


This is convenient for employing the estimation al- 
gorithm discussed later in this chapter. 

To simplify presentation of estimation and prediction
procedures given below, we will make use of the
notation [e] to represent the “distribution of e.” The
symbol “|” means conditional upon and is read as
“given.” For clarity, we will neglect the convention of
indexing quantities by $s$ and instead use the more
compact notation of subscripting the observations and
corresponding random effects by $i$; in other words,
$y(s_i) = y_i$. Thus, we will denote the $N \times 1$ vector of observed
BBS counts (on the $N$ BBS routes) as $y = (y_1, y_2,
\ldots, y_N)$. Similarly, the corresponding Poisson mean
vector, the vector of log-means, and the random effects
are the $N \times 1$ vectors $\lambda$, $\nu = \log(\lambda)$, and $Z$, respectively.

The joint likelihood of all observations for our
problem is then

$$[y \mid \lambda] = \prod_{i=1}^{N} \mathrm{Poisson}(\lambda_i) \qquad (56.2)$$

in other words, the product of $N$ Poisson likelihoods.
The joint distributions of $\nu = \log(\lambda)$ and $Z$ are multivariate
normal distributions:

$$[\nu \mid \mu, Z, \sigma_\eta^2] = \mathrm{MVN}(\mu \mathbf{1} + Z, \sigma_\eta^2 I) \qquad (56.3)$$

where $\mathbf{1}$ is an $N \times 1$ vector of ones, and $I$ is the $N \times N$
identity matrix, and

$$[Z \mid \theta, \sigma_z^2] = \mathrm{MVN}(0, \sigma_z^2 K_\theta) \qquad (56.4)$$

where $K_\theta = \mathrm{Corr}(Z, Z)$.
Note that the $Z$-process does not depend on $\sigma_\eta^2$,
so $\sigma_\eta^2$ does not appear after the “|” in Equation 56.4.
Similarly, conditional on $Z$, $\nu$ in Equation 56.3 is independent
of $\sigma_z^2$ and $\theta$.

Our goals under this model are estimation of the
parameters, $\mu$, $\theta$, $\sigma_z^2$, $\sigma_\eta^2$, and perhaps the vector of
route effects at the data locations, $Z$. Of more interest
is prediction of values of $Z(s)$ on a grid, say $Z_0$ (a
vector), and estimating appropriate variances for
these predictions. The vector of predictions of $Z_0$,
plotted with respect to their spatial attribution,
forms a map depicting spatial variation in relative
abundance, albeit on the log scale. We may also be
interested in the expected count at location $s$; in
other words, $E[\lambda(s)] = E[e^{\mu + Z(s) + \eta(s)}]$, which may be
more meaningful for interpretation. One implication
of the Poisson model (and, indeed, of most non-normal
models) is that closed-form expressions for parameter
estimates and predictions are not obtainable.
Instead, solutions must be obtained numerically, or
by simulation, which is the approach that we take in
the following section.
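
For fixed parameter values, the expected count has a simple form that helps in reading maps drawn on the log scale. The following worked step assumes that $Z(s)$ and $\eta(s)$ are independent of each other (an assumption made here for illustration), so that their sum is $N(0, \sigma_z^2 + \sigma_\eta^2)$ and the lognormal mean applies:

```latex
E[\lambda(s)] = E\left[e^{\mu + Z(s) + \eta(s)}\right]
             = \exp\left(\mu + \tfrac{1}{2}\left(\sigma_z^2 + \sigma_\eta^2\right)\right).
```

This is the expectation given the parameters only; the posterior expected count at a particular location also conditions on the observed data and has no comparable closed form.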

Finally, although we have presented the model in 
the absence of habitat covariate information, we 
would generally recommend that important habitat 
attributes be conditioned on (that is, incorporated 
into the $\mu$ term as discussed above) where they are
available. 


Model Fitting and Prediction by 
Markov Chain Monte Carlo 


The Poisson mixed model described above is a special 
case of a generalized linear mixed model—that is, a 
model that contains both fixed and random terms. 
Under such models, traditional estimation is difficult. 
Since the random effects in the model are essentially 
high-dimensional nuisance parameters, the classical 
approach is to integrate them from the conditional 
likelihood, producing a simplified unconditional likelihood
that is only a function of parameters $\mu$, $\theta$, $\sigma_z^2$, and
$\sigma_\eta^2$. The resulting $N$-fold integration problem is analytically
intractable and must be tackled using numerical
or simulation-based integration. We do this below, 
though in a more formal Bayesian framework. Strictly 
speaking, there are no fixed effects in a Bayesian 
analysis. Instead, we use this term to distinguish be- 
tween those parameters in the model that are given 
flat priors and those that are related through a prior 
distribution. Traditional prediction is equally difficult, 
requiring the numerical calculation of conditional expectations
such as $E[Z(s) \mid y]$. In a Bayesian setting, this
quantity is just the posterior mean of Z(s) and thus is 
simple to compute using techniques more common in 
Bayesian analysis, described below. We adopt several 
Markov chain Monte Carlo (MCMC) techniques for 
sampling from the relevant posterior distribution. Be- 
fore discussing the MCMC algorithm, we provide a 
Bayesian formulation of our mapping problem. 


Bayesian Formulation of the Mapping Problem 


The heart of the Poisson mixed model is specified by
the three probability distributions (Eqs. 56.2–56.4).
To complete the model specification, prior distributions
are required for the parameters. We assume that
the parameters are independent of one another and
express the joint prior distribution as the product of
the prior distributions for each parameter: $[\mu, \theta, \sigma_z^2,
\sigma_\eta^2] = [\mu][\theta][\sigma_z^2][\sigma_\eta^2]$. In our analysis, uniform priors
were assigned to $\mu$ and $\theta$ so as to reflect little prior information
about them. Specifically, $[\mu]$ was taken to
be constant (a so-called improper uniform prior) and
$[\theta]$ was taken to be uniform on [0, 1000], a proper uniform
prior. The implication of choosing a uniform
prior for a parameter is that the resulting estimates are
essentially maximum likelihood estimates. In the case
of the uniform prior on $\theta$, there is some arbitrariness in
restricting the range to [0, 1000] since, strictly
speaking, any positive value is possible. This was a
choice made more for pragmatic reasons having to do
with the simulation-based estimation of parameters.
We felt this range of values to be a reasonable range of
probable values for this parameter. Indeed, the estimates
were well below the 1,000 upper limit. See the Appendix
for further discussion. We parameterized the
variance components in terms of precision: $\tau_z = 1/\sigma_z^2$
and $\tau_\eta = 1/\sigma_\eta^2$. Diffuse gamma priors were assigned to
both of these precision parameters: $\tau_z \sim \mathrm{Gamma}(a_z, b_z)$
and $\tau_\eta \sim \mathrm{Gamma}(a_\eta, b_\eta)$. Parameters of prior distributions
are known as hyperparameters. In our case the
hyperparameters are $a_z$, $b_z$, $a_\eta$, and $b_\eta$. Often, their
values are fixed in the analysis, for example at values
that indicate virtually no prior knowledge of the value
of $\tau_z$ and $\tau_\eta$. Discussion of this can be found in the
Appendix.

Bayesian analysis focuses on analysis of appropri- 
ate posterior distributions—that is, the conditional 
distribution of the unknown quantities (e.g., parame- 
ters, predictands) given the data. Typically, one is in- 
terested in marginal posterior distributions of a partic- 
ular unknown. For example, the marginal posterior of 
the random effect at a location, say Z(s), is simply the 
conditional distribution $[Z(s) \mid y]$. Generally, it is this
distribution upon which one would conduct inference 
regarding Z(s). For example, the Bayes estimate is the 
mean of this posterior distribution, the posterior vari- 
ance quantifies uncertainty, and so forth. 

Unfortunately, in our problem, these posterior dis- 
tributions are analytically intractable in the sense that 
they are not of a “known,” convenient form. How- 
ever, to within a normalizing constant, the joint poste- 
rior distribution is merely the product of the likeli- 
hood (Poisson) and prior distributions (multivariate 
normals, gammas, and uniforms). For our spatial 
model of BBS data, the joint posterior (of all un- 
knowns in the model) is 


$$[\nu, Z, \mu, \theta, \tau_z, \tau_\eta \mid y] \propto [y \mid \nu]\,[\nu \mid \mu, Z, \tau_\eta]\,[Z \mid \theta, \tau_z]\,[\mu, \theta, \tau_z, \tau_\eta] \qquad (56.5)$$

Although this distribution cannot be analyzed directly,
simple algorithms exist that allow us to draw samples
from distributions that are specified up to a normalizing
constant. Our general strategy for analysis of this
model, then, is to draw a large number of samples from
the joint posterior distribution given by Equation 56.5
and then to estimate the quantities of interest from the
simulated values. For example, to estimate $\mu$, we sample
a large number of values from Equation 56.5, isolate
those simulated values of $\mu$, say $\mu^{(1)}, \mu^{(2)}, \ldots$, and
then use these values to estimate interesting features of
the marginal posterior of $\mu$ (e.g., the mean and variance).
The general method for accomplishing this simulation
task is known as Markov chain Monte Carlo;
we describe this in the following section. Since our 
Poisson model is similar to that of Diggle et al. (1998), 
the interested reader is referred to their paper for fur- 
ther details on MCMC-based estimation. 
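
Because the joint posterior is needed only up to a normalizing constant, it can be evaluated as a sum of log terms, one per factor in Equation 56.5. The following is a minimal sketch under stated assumptions: the gamma priors are taken in shape-rate form, the hyperparameter defaults (0.01) are placeholders, and the function names are hypothetical rather than taken from the Appendix.

```python
import math


def cholesky(A):
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L


def log_mvn_zero_mean(z, K, tau):
    """log MVN(z; 0, K / tau), dropping the 2*pi constant."""
    n = len(z)
    L = cholesky(K)
    w = []  # forward substitution: solve L w = z
    for i in range(n):
        w.append((z[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i])
    log_det_K = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return 0.5 * n * math.log(tau) - 0.5 * log_det_K - 0.5 * tau * sum(v * v for v in w)


def log_gamma_prior(tau, a, b):
    """log Gamma(tau; shape a, rate b), dropping the normalizing constant."""
    return (a - 1.0) * math.log(tau) - b * tau


def log_posterior(nu, Z, mu, theta, tau_z, tau_eta, y, sites,
                  a=0.01, b=0.01, theta_max=1000.0):
    """Unnormalized log joint posterior in the spirit of Equation 56.5."""
    if not (0.0 < theta <= theta_max) or tau_z <= 0.0 or tau_eta <= 0.0:
        return -math.inf
    n = len(y)
    # [y | nu]: Poisson with log-mean nu_i (the log y_i! term is constant)
    lp = sum(yi * vi - math.exp(vi) for yi, vi in zip(y, nu))
    # [nu | mu, Z, tau_eta]: independent normals around mu + Z_i
    lp += 0.5 * n * math.log(tau_eta)
    lp -= 0.5 * tau_eta * sum((v - mu - z) ** 2 for v, z in zip(nu, Z))
    # [Z | theta, tau_z]: multivariate normal with exponential correlation
    K = [[math.exp(-math.dist(s, t) / theta) for t in sites] for s in sites]
    lp += log_mvn_zero_mean(Z, K, tau_z)
    # Priors: flat on mu and theta (within range), gamma on both precisions
    lp += log_gamma_prior(tau_z, a, b) + log_gamma_prior(tau_eta, a, b)
    return lp


sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
value = log_posterior(nu=[3.0, 3.1, 2.9], Z=[0.0, 0.1, -0.1], mu=3.0, theta=2.0,
                      tau_z=4.0, tau_eta=25.0, y=[20, 24, 17], sites=sites)
print(value)
```

A sampler never needs the missing constant: only differences of this function between candidate values enter the acceptance calculations.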


Markov Chain Monte Carlo 


Markov chain Monte Carlo is a collection of methods 
for simulating from complicated multivariate distribu- 
tions. The idea behind MCMC is to simulate a random 
walk through the parameter space that converges to the 
target distribution (that is, the distribution of the samples
converges to the target distribution). Formally, we
simulate the sequence $\theta^t$, $t = 1, 2, \ldots$, by starting at some
point $\theta^0$ and then, for each $t$, drawing $\theta^t$ from a transition
distribution, say $P_t(\theta^t \mid \theta^{t-1})$, that depends on the
previous value and that may vary from one iteration to
the next, hence the subscript $t$ on $P$ (here $\theta$ stands for
the full set of unknowns, not only the correlation parameter).
The key is construction of the transition probability distributions so
that the Markov chain converges to a stationary distribution
that is the desired target distribution. Upon convergence
of the chain, say at iteration $T$, subsequent
samples $\theta^{T+1}, \theta^{T+2}, \ldots$ are correlated observations
from the appropriate distribution and relevant quantities
of interest (e.g., means, variances) may be computed
from them. In the present context, we use
MCMC to generate samples from the posterior distribution
in Equation 56.5. Although MCMC is relatively
“new” in statistical science, the literature on MCMC
techniques is vast, and even a simple presentation of the 
details is beyond the scope of this chapter. The inter- 
ested reader is referred to Gelman et al. (1995) and 
Gilks et al. (1996) for recent expositions on the subject. 


In our application, we use Gibbs sampling (Gelfand et 
al. 1990; Zeger and Karim 1991; Casella and George 
1992) and the Metropolis-Hastings algorithm (Chib 
and Greenberg 1995). Detailed discussion of both 
methods, in addition to background and other related 
material, can also be found in the texts by Gelman et al. 
(1995) and Gilks et al. (1996). In Gibbs sampling, the 
transition distributions are chosen to be the full condi- 
tional distributions of each parameter or vector of pa- 
rameters. The full conditional distribution of a parame- 
ter is defined as the conditional distribution of that 
parameter given all other quantities in the model. These 
are typically easy to construct and are nominally pro- 
portional to the product of the likelihood, or the com- 
ponent model in which the parameter appears, and the 
prior distribution for that parameter. The key advan- 
tage of Gibbs sampling is that the full conditional dis- 
tributions are typically of simple form and can often be 
sampled directly. However, sometimes they too are only 
known up to a normalizing constant. In this case, one 
must rely on other methods to sample from them. A 
common method, and that which we used where neces- 
sary, is the Metropolis-Hastings algorithm, which re- 
quires sampling from an approximating distribution. 
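As a minimal sketch of Gibbs sampling (not the model of this chapter), consider normal data with unknown mean and precision. Assuming a flat prior on the mean and a Gamma(a, b) prior on the precision (shape a, scale b), both full conditionals have simple forms and can be sampled directly. The simulated data, the prior values, and the function name are illustrative assumptions.

```python
import random
import statistics

def gibbs_normal(y, a=0.1, b=10.0, n_iter=4000, burn_in=1000, seed=1):
    """Gibbs sampler for y_i ~ Normal(mu, 1/tau).

    Assumed priors (illustrative): flat on mu, Gamma(shape=a, scale=b) on tau.
    Returns post-burn-in draws of (mu, tau).
    """
    rng = random.Random(seed)
    n = len(y)
    ybar = statistics.fmean(y)
    mu, tau = ybar, 1.0  # initial values
    draws = []
    for t in range(n_iter):
        # Full conditional of mu: Normal(ybar, 1 / (n * tau))
        mu = rng.gauss(ybar, (1.0 / (n * tau)) ** 0.5)
        # Full conditional of tau: Gamma(a + n/2, scale = 1 / (1/b + SS/2))
        ss = sum((yi - mu) ** 2 for yi in y)
        tau = rng.gammavariate(a + n / 2.0, 1.0 / (1.0 / b + 0.5 * ss))
        if t >= burn_in:
            draws.append((mu, tau))
    return draws

# Simulated data with true mean 3 and standard deviation 0.5 (precision 4)
data_rng = random.Random(0)
y = [data_rng.gauss(3.0, 0.5) for _ in range(200)]
draws = gibbs_normal(y)
post_mean_mu = statistics.fmean(d[0] for d in draws)
post_mean_tau = statistics.fmean(d[1] for d in draws)
print(round(post_mean_mu, 2), round(post_mean_tau, 2))
```

Each pass through the loop draws every parameter once from its full conditional, given the most recent values of the others. When a full conditional cannot be sampled directly, that draw would be replaced by a Metropolis-Hastings step.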

In theory, samples from the full conditional distri- 
butions produce a Markov chain whose stationary 
distribution is the target posterior. Posterior quantities 
of interest are then computed from the resulting simu- 
lated data after convergence is judged to have oc- 
curred. There are many issues that one must address 
in applying MCMC algorithms, including selecting 
approximating distributions for the Metropolis- 
Hastings algorithm, assessing convergence of the 
chain, and so forth. For details, the interested reader may
consult Gilks et al. (1996).

We present our MCMC algorithm in the Appendix. 
This is of sufficient detail so as to permit the interested 
reader to fit our model, but technical details as to how 
the algorithm is derived are omitted. The interested 
reader is referred to Gilks et al. (1996) for discussion of this.


Results 


In our analysis, the Markov chain was run for fifty 
thousand iterations at which point convergence was 
judged to have occurred. We then ran the chain for 
another ten thousand iterations, sampling $Z_g$ every fifth
iteration in order to reduce serial correlation in the simulated
values. Estimates of parameters other than $Z_g$
were based on all ten thousand "converged" samples.

To illustrate what "output" from an MCMC-based
analysis looks like, the simulated values for the structural
parameters of the model, $\mu$, $\theta$, $\tau_z$, and $\tau_\eta$, are
shown in Figure 56.1. Each point depicted on these 
plots (which are connected by lines) represents a single 
simulated value of the relevant parameter (and thus 
there are ten thousand points). The corresponding 
posterior distribution estimates, formed by computing 
a histogram of the simulated values, are shown in Fig- 
ure 56.2. The simulation history plots are indicative of 
fairly rapid convergence, the values fluctuating within 
a narrow range of their mean, with no slowly varying 
departures or rapid shifts in the mean. Parameter esti- 
mates were computed as the means of the marginal 
posterior distributions, and error estimates from the 
standard deviations of the marginal posteriors. Posterior
means and standard deviations of $\mu$, $\theta$, $\tau_z$, and $\tau_\eta$
were $\hat{\mu} = 2.96$ (0.0605), $\hat{\theta} = 310.2$ (169.6), $\hat{\tau}_z =$ [illegible]
(0.414), and $\hat{\tau}_\eta = 4.47$ (1.413). Notably, the estimate
of $\theta$ is very imprecise. This is not surprising given the
relatively small sample size (N = 87) for a spatial sta- 
tistical problem. 


Inference, Prediction Uncertainty, 
and Sampling Design 


Inference within the context of a Bayesian model is al- 
ways relative to the posterior distribution of the quan- 
tity of interest. For example, if one were interested in 
inference concerning $\mu$ in our model, then one can
construct a 95 percent credible interval (a Bayesian 
"confidence interval") from the simulated values by 
locating the 2.5 and 97.5 percentiles. The resulting in- 
terval is (2.85, 3.094). Of course, a point estimate is 
the posterior mean (2.96), and an assessment of varia- 
tion is the posterior variance. Other interesting quan- 
tities are easily estimated from the simulated values, 
for example, the median, mode, and arbitrary per- 
centiles. For a model containing habitat covariates, 
one would likely be interested in carrying out infer- 
ence regarding the effect of those habitat covariates. 
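The percentile calculation described above can be sketched as follows. The draws here are simulated stand-ins centered on the reported posterior mean; they are not the authors' MCMC output, and the function name is a hypothetical choice.

```python
import random
import statistics

def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval from posterior draws (percentile method)."""
    ordered = sorted(samples)
    n = len(ordered)
    lower_q = (1.0 - level) / 2.0

    def percentile(q):
        # Linear interpolation between adjacent order statistics
        pos = q * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

    return percentile(lower_q), percentile(1.0 - lower_q)

# Hypothetical stand-in for ten thousand converged draws of a parameter
rng = random.Random(42)
draws = [rng.gauss(2.96, 0.06) for _ in range(10_000)]

low, high = credible_interval(draws)
print(f"posterior mean   = {statistics.fmean(draws):.3f}")
print(f"posterior median = {statistics.median(draws):.3f}")
print(f"95% interval     = ({low:.3f}, {high:.3f})")
```

The same function applies unchanged to draws of any other quantity, such as a route effect at a prediction point.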
The general strategy is the same when interest centers
on assessment of prediction uncertainty. In this
case, the marginal posterior distribution of each point
at which a prediction is desired can be used to obtain 
a measure of uncertainty of the prediction at that 
point. We illustrate this while at the same time show- 
ing the implication of spatial correlation in the model. 
Figure 56.3 (top) shows the locations of two grid 
points; one, labeled "B," is located relatively far from
sample locations, and thus we expect less-accurate
prediction of its route effect, $Z(s_B)$. The other, labeled
"G," is located very near several sample locations, and
so we expect more accurate prediction of its route effect,
$Z(s_G)$. Figure 56.3 (bottom) shows the estimated
marginal posterior distributions for these two route 
effects. The posterior for the first point is noticeably 
more diffuse than that of the latter. The posterior stan- 
dard deviation of point B was 0.375, whereas the pos- 
terior standard deviation of point G was 0.258. One 
could also obtain credible intervals for these predic- 
tions, and other interesting quantities, from the values 
simulated from the posterior distribution. Ecologists 
may be interested in other types of inference problems,
for example, comparison of averages over geographic
strata formed by watershed boundaries, land-use considerations,
and the like. Such comparisons can be
made by examining the posterior distribution of the 
quantity of interest (an areal average). 

Of course, our ultimate aim with this work was to 
produce a map depicting the relative abundance of 
birds over space while at the same time qualifying that 
map with an assessment of uncertainty. A map of the 
predicted route effects (the marginal posterior means) 
on the 1,200 grid points is shown in Figure 56.4 (top) 
and the corresponding prediction standard error map 
(the marginal posterior standard deviations) is shown 
in Figure 56.4 (bottom) (see color section). Maps of 
the expected route count predictions, $\lambda(s) = e^{u(s)}$, and
associated standard errors are shown in Figure 56.5 
(see color section). Of note here is that the standard er- 
rors of the expected count exhibit a strong relationship 
with the mean, as is expected under a Poisson model. 
Under many traditional methods of mapping data such 
as these, assessment of uncertainty is difficult, and 
often ad hoc at best. We feel that the ability to do this 
within a rigorous statistical framework is the primary 
benefit of the Poisson model that we have proposed. 

Finally, the model also provides a framework for
evaluating issues of spatial sample allocation, allowing
for sampling design in order to better estimate abundance
at specific locations or over regions of particular
interest. For example, in Figure 56.4, we observe regions
of relatively high prediction error, and other regions
of relatively low prediction error. Intuitively,
such information can be used in redesign of the sampling
plan by moving clustered routes into regions of
high prediction error. More formally, one can use the
estimated covariance structure of the spatial process
to address design issues, for example, to identify
that subset of routes which, when deleted or moved,
produces the best change in average prediction variance
of the route effects. This is a relatively straightforward
optimization exercise. Such problems are well
studied in the context of air pollution monitoring networks
(see Cox et al. [1995] and Nychka and Saltzman
[1998] for reviews), yet this work has not been
adopted in the development of sampling plans for ecological
processes, to the best of our knowledge.

Figure 56.1. Markov Chain Monte Carlo simulation histories of model parameters.

Figure 56.2. Estimated marginal posterior distributions of model parameters.
Figure 56.3. Top: BBS route locations and prediction grid, indicating a well-predicted (point G) and a poorly predicted (point B) grid point.
Bottom: Posterior distributions of predictions at points labeled G and B.


Discussion and Further Investigations 


The Poisson model presented here represents an ad- 
vance over inverse-distance and kriging-based ap- 
proaches to spatial summary of count data in that an 
appropriate statistical model is fit to the data using re- 
alistic distributional assumptions. Using this model, 
estimates with appropriate standard errors can be
used to evaluate questions about spatial differences in
relative abundance, inference regarding model parameters
and predictions, and sampling design issues. Fitting
and prediction under the Poisson model were
based on Markov chain Monte Carlo methods. These
methods, while computationally intensive, provide a
powerful tool for fitting complicated models, and we
expect that this approach will have great applicability
for a variety of ecological modeling exercises.

The final model (Figs. 56.4 and 56.5) provides a 
reasonable spatial smooth of mourning dove relative 
abundance in Pennsylvania. Spatial patterns exist in 
Pennsylvania mourning dove populations, with low- 
est abundances in the north-central part of the state, 
and higher abundances in the southeast and west- 
central parts of the state. These patterns appear to be 
closely associated with the distribution of forest 
cover in the state (see Fig. 56.6 in color section), 
with high dove relative abundance being associated 
with low forest cover, although we have not yet ad- 
dressed these issues formally within the context of 
our model. These patterns raise an important question:
what is the genesis of spatial correlation
in BBS data? Clearly, habitat plays a major role
and we feel that models that address mapping and 
other prediction problems should accommodate 
habitat information when possible. The Poisson 
model is easily extended to include information on 
habitat covariates. The large variety of remotely 
sensed databases on habitat (e.g., Jones et al. 1997 
for Pennsylvania) greatly facilitates these sorts of
analyses, and we are presently working to include 
habitat variables from remotely sensed data in our 
analyses of BBS abundances. However, since habitat 
structure is generally sampled infrequently in time, is 
measured with error, and is often ambiguously de- 
fined, it leads to many interesting and complex sta- 
tistical problems that we hope to address in exten- 
sions of this work. Nevertheless, one could 
potentially condition on the “right” set of habitat 
variables, thereby diminishing the spatial correlation 
remaining to be modeled. But, since spatial structure 
from other sources may exist (such as general pat- 
terns in abundance within a species’ range), condi- 
tioning on habitat might not entirely account for all 
of the spatial variation that is present. We also com- 
puted predictions under a model lacking observer 
variation. The impact of neglecting observer varia- 
tion was obvious; “rougher” maps with more local 
maxima and minima were produced, since the model 
did not have the ability to smooth unusually small
and large observations. Although these results were
interesting insofar as they lend support to the
need to accommodate observer variation, space does
not permit their presentation here.

Of course, the BBS is an ongoing survey, and data 
exist for over thirty years of surveys on BBS routes in 
Pennsylvania. The temporal structure of the BBS can 
also be included in a Poisson model, although incor- 
porating the limitations of the BBS design poses inter- 
esting problems in most analyses of population change 
(e.g., Link and Sauer 1998). For example, it has been 
documented that temporal patterns exist in observer 
quality, and hence this covariate must be accommo- 
dated in analyses of temporal change in counts. Any 
analysis of count survey data must accommodate fac- 
tors that influence detectability of birds among BBS 
routes, and simple analyses that do not include these 
factors can lead to biased estimates of spatial and tem- 
poral change in populations (e.g., Link and Sauer 
1997b). 

Finally, this general modeling approach extends 
easily to other non-normal data. For example, in map- 
ping the range of a bird species using presence/absence 
(or atlas) data, one might assume that data follow a 
Bernoulli distribution where the probability of occur- 
rence is spatially correlated. The same idea can be ap- 
plied to spatially indexed binomial counts, as Diggle et 
al. (1998) illustrated. Presence/absence mapping may 
also be handled using the autologistic model (as in 
Klute, Chapter 27), but the latter requires that the 
model be defined with respect to a lattice, so care must 
be taken when dealing with data collected in continu- 
ous space. 
Appendix: The MCMC Algorithm


In the following description of the MCMC algorithm
that we employed, the notation $[x \mid \cdot]$ denotes the full
conditional of the parameter $x$, and superscripts index
the iteration; for example, $\theta^{(t)}$ is the value of $\theta$ at iteration
$t$. Before beginning the MCMC simulation, initial
values must be chosen for all parameters. There is no
general rule of thumb for choosing initial values;
however, estimates based on traditional methods perform
well when they can be computed. For example,
initial values for $Z$ might be computed by smoothing
the log-transformed data, whereas the initial value for $\mu$
can be based on a GLM fit. The reader should keep in
mind that this algorithm generates a sequence of samples
from the joint posterior, say, $\{(\mu^{(t)}, Z^{(t)}, \theta^{(t)}, \ldots) : t = 1, 2, \ldots, M\}$.
These samples can then be used to estimate
important features of the posterior distribution,
as illustrated in Discussion and Further Readings.

Each iteration of the MCMC algorithm consists of 
the following six steps (comments and explanation 
follow): 


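Step (1) For $i = 1, 2, \ldots, N$, sample from $[u_i \mid \cdot] \propto [y_i \mid u_i]\,[u_i \mid \mu, Z_i, \tau_\eta]$ as follows:

(1a) Generate $u_i^{*} \sim N(u_i^{(t-1)}, \delta_1)$ and compute the
ratio

$$r = \frac{[y_i \mid u_i^{*}]\,[u_i^{*} \mid \mu^{(t-1)}, Z_i^{(t-1)}, \tau_\eta^{(t-1)}]}{[y_i \mid u_i^{(t-1)}]\,[u_i^{(t-1)} \mid \mu^{(t-1)}, Z_i^{(t-1)}, \tau_\eta^{(t-1)}]}$$

(1b) Set $u_i^{(t)} = u_i^{*}$ with probability $\min(r, 1)$

(1c) Otherwise, set $u_i^{(t)} = u_i^{(t-1)}$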
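Step (2) Sample $Z$ directly from the full conditional:

$$[Z \mid \cdot] = \mathrm{MVN}\left( (\tau_\eta I + \tau_z K^{-1})^{-1}\,\tau_\eta\,(u^{(t)} - \mu^{(t-1)}\mathbf{1}),\ (\tau_\eta I + \tau_z K^{-1})^{-1} \right)$$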


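Step (3) Sample $\theta$ from $[\theta \mid \cdot] \propto [Z \mid \theta, \tau_z]$ as follows:

(3a) Generate $\theta^{*} \sim DU(0, 1000)$ and compute the
ratio

$$r = \frac{[Z^{(t)} \mid \theta^{*}, \tau_z^{(t-1)}]}{[Z^{(t)} \mid \theta^{(t-1)}, \tau_z^{(t-1)}]}$$

(3b) Set $\theta^{(t)} = \theta^{*}$ with probability $\min(r, 1)$
(3c) Otherwise, set $\theta^{(t)} = \theta^{(t-1)}$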


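Step (4) Sample $\mu^{(t)}$ directly from its full conditional
distribution:

$$\mu^{(t)} \sim \mathrm{Normal}\left(\bar{u}^{(t)},\ \frac{1}{N\tau_\eta}\right)$$

where $\bar{u}^{(t)} = (1/N)\sum_i u_i^{(t)}$ is the mean of the $u_i$ values
generated in step (1) above.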


Step (5) Sample $\tau_z$ directly from its full conditional:

$$[\tau_z \mid \cdot] = \mathrm{Gamma}\left(a_1 + (N/2),\ \frac{1}{(1/b_1) + (1/2)\,Z'K^{-1}Z}\right)$$


Step (6) Sample $\tau_\eta$ directly from its full conditional:

$$[\tau_\eta \mid \cdot] = \mathrm{Gamma}\left(a_2 + (N/2),\ \frac{1}{(1/b_2) + (1/2)\sum_i (u_i - \mu - Z_i)^2}\right)$$

Comments 


The matrix $K$ depends explicitly on $\theta$, though this dependence
has been suppressed. As a consequence, as $\theta$
is updated, so must $K$ be. The notation $DU(0, 1000)$
indicates that candidate values of $\theta$ were drawn from
a discrete uniform distribution on [0, 1000]. Because
simulating from the full conditional of $\theta$ requires computing
both the determinant of $K$ and its inverse, and
since $K$ depends on $\theta$, performing this computation is
computationally very demanding. Therefore, the determinants
and inverses were computed for 100 values
on the interval [0, 1000] and stored for subsequent
use, thereby avoiding on-the-fly computation at every
iteration of the Gibbs sampler. Although this induces
some loss of precision in estimating $\theta$, we do not feel
this to be of significant enough concern to warrant a
more precise candidate distribution, such as continuous
uniform on the same interval.
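The precomputation idea can be sketched with a toy two-site example. The exponential correlation function, the grid 10, 20, ..., 1000, and the distance used here are all assumptions of the sketch; in the actual problem $K$ is an $N \times N$ matrix and a numerical linear algebra routine would replace the closed-form 2 x 2 inverse.

```python
import math

def correlation_2x2(theta, distance):
    """Correlation matrix for two sites under an assumed exponential correlation."""
    rho = math.exp(-distance / theta)
    return [[1.0, rho], [rho, 1.0]]

def det_and_inverse_2x2(k):
    """Determinant and inverse of a 2 x 2 matrix in closed form."""
    (a, b), (c, d) = k
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return det, inv

# Precompute once on a grid of 100 candidate values (10, 20, ..., 1000).
# The grid spacing and the exclusion of theta = 0 are assumptions of this sketch.
theta_grid = [10 * k for k in range(1, 101)]
distance = 50.0  # hypothetical distance between the two sites
cache = {theta: det_and_inverse_2x2(correlation_2x2(theta, distance))
         for theta in theta_grid}

# Inside the sampler, evaluating a candidate theta is then a dictionary lookup
det, inv = cache[310]
print(len(cache), round(det, 4))
```

Restricting candidates to the stored grid is what makes the lookup possible, at the cost of the loss of precision noted above.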

The mean and variance of a Gamma($a$, $b$) distribution
are $ab$ and $ab^2$, respectively. In our analysis, we
set $a_1 = a_2 = 1/10$ and $b_1 = b_2 = 10$ so that the priors on
$\tau_z$ and $\tau_\eta$ both had mean 1 and variance 10, which are
highly diffuse priors for the variance components, expressing
little prior information about them.
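The stated prior moments follow directly from the formulas for the mean and variance, writing $\tau$ for either precision parameter:

```latex
\mathrm{E}[\tau] = ab = \tfrac{1}{10} \times 10 = 1,
\qquad
\mathrm{Var}[\tau] = ab^{2} = \tfrac{1}{10} \times 10^{2} = 10 .
```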

The parameters $\delta_1$ and $\delta_2$ of Steps (1a) and (4) are
merely tuning parameters of the MCMC algorithm; 
that is, they have no effect on the estimates obtained, 
but only on the manner in which they are obtained. 
Large values ensure that the parameter space is ex- 
plored more rapidly, at the expense of rejecting a 
higher proportion of the draws. Conversely, small val- 
ues lead to less rapid movement about the parameter 
space but a higher acceptance rate of the draws. Of 
course, “large” and “small” are relative to the distri- 
bution of mass in the posterior, and for our analysis 
$\delta_1 = 1$ and $\delta_2 = 1/8$ appeared to provide a reasonable
compromise between exploration of the parameter 
space and acceptance rate. 
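The trade-off between step size and acceptance rate can be seen in a small experiment. The sketch below uses a random-walk Metropolis sampler on a standard normal target, chosen only for illustration; it is not the posterior of this chapter, and the function name and settings are hypothetical.

```python
import math
import random

def acceptance_rate(proposal_sd, n_iter=5000, seed=3):
    """Fraction of accepted random-walk Metropolis candidates.

    The target is a standard normal density, chosen only for illustration.
    """
    rng = random.Random(seed)
    x = 0.0
    accepted = 0
    for _ in range(n_iter):
        candidate = rng.gauss(x, proposal_sd)
        log_r = 0.5 * x * x - 0.5 * candidate * candidate  # log density ratio
        if log_r >= 0 or rng.random() < math.exp(log_r):
            x = candidate
            accepted += 1
    return accepted / n_iter

for sd in (0.1, 1.0, 10.0):
    print(sd, acceptance_rate(sd))
```

Small steps are accepted almost always but move slowly; large steps are rejected most of the time. An intermediate value balances the two.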

To compute the prediction of $Z_g$, the above algorithm
is run until convergence, at which point (following
Diggle et al. [1998]) we add the following step to
the algorithm:


Step (7) Sample $Z_g$ directly from the full conditional:

$$[Z_g \mid Z, \theta, \tau_z] = \mathrm{MVN}\left(\Sigma_1 K^{-1} Z,\ \Sigma_2 - \Sigma_1 K^{-1} \Sigma_1'\right)$$

where $\Sigma_1 = \mathrm{Cov}(Z_g, Z)$ and $\Sigma_2 = \mathrm{Var}(Z_g)$. Also, it
may be desirable to estimate the vector $\lambda_g$, the Poisson
mean at the grid of prediction points. To do this, we
include the following step in the algorithm:


Step (8) Sample $u_g^{(t)}$ directly from:

$$[u_g \mid \mu, Z_g, \tau_\eta] = N\left(\mu^{(t)} + Z_g^{(t)},\ \frac{1}{\tau_\eta}\right)$$

These simulated values are then exponentiated 
to obtain the sample of $\lambda_g$. This simulation is then
run until enough samples have been drawn to achieve 
precise estimates of the model parameters and 
predictions. 
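For a single grid point, the draw in Step (8) and the exponentiation can be sketched as follows. The per-iteration values of mu, Z_g, and tau_eta are simulated placeholders rather than output of the sampler, and tau_eta is treated as a precision; both are assumptions of the sketch. In practice these lines would sit inside the MCMC loop.

```python
import math
import random
import statistics

rng = random.Random(11)

# Hypothetical converged draws for one grid point: (mu, Z_g, tau_eta) per iteration
draws = [(rng.gauss(3.0, 0.06), rng.gauss(0.2, 0.3), rng.gammavariate(20.0, 0.2))
         for _ in range(5000)]

lambda_g = []
for mu, z_g, tau_eta in draws:
    # Step (8): u_g ~ Normal(mu + Z_g, 1 / tau_eta); tau_eta treated as a precision
    u_g = rng.gauss(mu + z_g, math.sqrt(1.0 / tau_eta))
    # Exponentiate to obtain a draw of the Poisson mean at this grid point
    lambda_g.append(math.exp(u_g))

print("predicted count (posterior mean):", round(statistics.fmean(lambda_g), 2))
print("prediction standard error (posterior sd):", round(statistics.stdev(lambda_g), 2))
```

Summaries are taken after exponentiating each draw, not by exponentiating the posterior mean of $u_g$; the two differ because the exponential is nonlinear.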

Sampling from the full conditional in Step (7) is,
in general, the computationally limiting stage of the
MCMC simulation since it requires manipulating
the multivariate normal variance-covariance matrix,
which is of dimension $N_g \times N_g$, where $N_g$ is the
number of grid points at which predictions are desired
(in our mourning dove illustration, $N_g$ =
1,200). Since successive iterates of the Markov chain
tend to be highly correlated, and since Step (7) is
computationally expensive, it is recommended that
this step be performed only every few iterations. We
sampled $Z_g$ every 5th iteration, as in Diggle et al.
(1998), but for problems such as ours in which $N$ is
relatively small compared with $N_g$, less frequent
sampling may be more efficient. As a consequence,
draws of $Z_g$ will be much less correlated, and fewer
samples will be required to produce an adequate
estimate.
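The effect of thinning on serial correlation can be illustrated with a stand-in chain. The first-order autoregressive series below is an assumption used only to mimic correlated MCMC output; it is not drawn from the model.

```python
import random

def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

# Stand-in for a correlated chain: a first-order autoregressive series
rng = random.Random(5)
chain = [0.0]
for _ in range(9999):
    chain.append(0.9 * chain[-1] + rng.gauss(0.0, 1.0))

thinned = chain[::5]  # keep every 5th iterate
print(round(lag1_autocorrelation(chain), 2), round(lag1_autocorrelation(thinned), 2))
```

Retaining every fifth value leaves one-fifth as many draws, but adjacent retained draws are much less correlated, so little information is lost while the expensive step is run far less often.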
Influence of Selected Environmental Variables
on GIS-habitat Models Used for Gap Analysis 


Carlos Gonzalez-Rebeles, Bruce C. Thompson, and Fred C. Bryant


Gap analysis is a geographic approach to assessing
biodiversity distribution and its conservation status
based on a series of digital information layers that
are combined and analyzed in a geographic informa- 
tion system (GIS). The analysis requires the following 
data sets: (1) present distribution of land cover (pro- 
duced from the classification of satellite imagery to- 
gether with ancillary information), (2) distributions 
for vertebrate species (predicted from knowledge of 
their habitat associations and the spatial representa- 
tion of those habitats), and (3) maps of land steward- 
ship (to differentiate management status relative to 
conservation potential). Within the GIS, distributions
of plant and animal species are analyzed relative to the
location of areas devoted to conservation. Sites with selected
species or significant vegetation communities
not adequately represented by current conservation
systems will constitute “gaps” requiring priority attention.
Detailed descriptions of gap analysis concepts
and specific procedures are presented in Scott et al.
(1993) and Gap Analysis Program (1998).

Modeling wildlife species distributions is a key ele- 
ment in the gap analysis process and a variety of other 
conservation endeavors (Morrison et al. 1998). The 
definition of priority areas and their actual mapping is 
based on the combination of all individual species dis- 
tribution estimates for a project area (Butterfield et al. 
1994). Wildlife distribution predictions, within the gap
analysis context, are mainly based on two types of information:
location data (geographic unit or point
data) and species associations with land cover (habitat 
indicator) (Butterfield et al. 1994; Csuti and Crist 
1998). Within the GIS, species are identified as being 
present in unique polygons where the geographic loca- 
tion characteristics and the preferred land cover over- 
lap. The process selectively filters out unsuitable habi- 
tat from the original coarse distribution and adds 
other potential habitat sites by extrapolating known 
distributions to the boundaries set by suitable land- 
cover associations (Scott et al. 1993; Scott and Jen- 
nings 1998). 

However, these types of “basic” models tend to 
overestimate distributions (Stoms and Estes 1993; 
Stoms 1994). Thus, different environmental variables 
based on specific habitat characteristics (e.g., soil, ele- 
vation, temperature, slope) also are included with the 
basic model as “filters” to provide further detail for 
the distribution estimates. It is expected that the addi- 
tion of each variable will result in a cumulative restric- 
tion in the spatial distribution of wildlife species such 
that an "adjustment" is made to the model (Butterfield
et al. 1994; Csuti and Crist 1998). It is assumed that a
simpler distribution model, built with fewer or different
variables, would produce a different and less accurate
estimate. Nevertheless, it is
impossible to know a priori how each variable will be
spatially represented in the habitat map or how it
relates to other variables. Some of these variables
may be spatially correlated and may not actually
provide additional information. 

Our research addressed questions related to the efficacy
of using these filter variables in the model (i.e.,
whether their use produced a difference and, if so, how
large that influence was). Our work was based
on the premise that GIS-habitat models used for gap 
analysis are structured from a basic model of habitat 
associations represented by land cover and location 
data (base variables) and that additional sets of dif- 
ferent environmental variables (filter variables) are 
used to adjust the distribution estimate. For the 
analysis, we used New Mexico Gap Analysis Project 
(NMGAP) predictive models for animal distribution. 
The objectives of this research were (1) to
quantify the level of response (spatial changes) re- 
flected by altered distribution estimates to different 
levels of model perturbation (systematic reduction of 
filter variables) as an indirect measure of the value of 
the information contributed by including different 
combinations of filter variables in the distribution 
models, and (2) to examine the response patterns rel- 
ative to differences in model types (number of filter 
variables used). 


Methods 


Data used in the analysis were obtained from
preliminary NMGAP vertebrate distribution models.
These models integrated a species-habitat relationship database
for a total of 584 species (26 amphibians, 96 reptiles,
324 birds, and 138 mammals); see NMGAP
Final Report (Thompson et al. 1996). At the time the
study was developed, distribution models were defined
from nine major habitat association categories. Base 
variables always included in all models were (1) coun- 
ties where the species is known to occur, (2) presence 
by county within watershed limits, and (3) land-cover 
types with which species are associated. Depending on 
the species, different combinations were applied for 
the remaining six filter variables: (1) soil associations, 
(2) elevation limits (minimum and maximum), (3) as- 
sociation with aquatic features, (4) slope affinities, (5) 
temperature limits, and (6) distribution by mountain 
ranges. The number and type of these filter variables could
coincide or differ among species, but specific descriptions
of habitat elements represented by variable categories
were unique for each of the species.

Our sample was representative of the different 
types of NMGAP distribution models based on how 
they were structured by the number of filter variables 
included and not specifically directed to represent any 
taxa in particular. We did not consider special model- 
ing cases defined by an aquatic variable (i.e., amphib- 
ians, marsh birds, shore birds, and waterfowl models) 
or preliminary models formed only by base variables. 
From the remaining 423 models, we selected a strati- 
fied random sample to represent three basic model 
type groups. The groups were composed of base vari- 
ables and either one, two, or three filter variables. An 
additional restriction was applied to proportionally 
represent models for species with widespread and re- 
stricted distributions (distinction defined in spatial ex- 
tent relative to the size of New Mexico: less than one- 
fourth or more than one-fourth up to approximately 
one-half of the state's surface).

The final sample contained a total of thirty-one 
habitat models representing different species of mam- 
mals, birds, and reptiles subdivided into three groups: 
Group 1 had twelve distribution models with only one 
filter variable, Group 2 had twelve distribution models 
with sets of two filter variables, and Group 3 had 
seven distribution models with sets of three filter 
variables. 

Data were generated following NMGAP modeling 
procedures (see Gonzalez-Rebeles 1996 and Thomp- 
son et al. 1996). Models were perturbed by systemati- 
cally removing filter variables one at a time and also 
by combined sets (two or three at a time) to complete 
all possible combinations depending on the number of
filter variables contained by model type (Groups 1, 2, 
and 3). By nature of the modeling process, changes 
produced from removing variables were expected to 
be in one direction, progressively expanding the distri- 
bution estimates (i.e., loss of detail). Thus, size of the 
response measured was considered indirectly to be an 
indicator of the amount of information, the adjust- 
ment value, provided by the variables tested (re- 
moved). Every time a filter variable or set of filter vari- 
ables was removed from the model, other filter 
variables in the model (if present) were maintained. 
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Figure 57.1. Distribution estimate for western banded gecko 
(Coleonyx variegatus) using the full set of model variables 
(Source: New Mexico Gap Analysis Project, NM-CFWRU). 


Base variables were always maintained. Response 
measured from altered distribution estimates was the 
percent difference in total area (square kilometers) of 
species distribution covered by the estimate as com- 
pared to original distributions with the full comple- 
ment of filter variables plus base variables (Figs. 57.1 
and 57.2). 
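The perturbation scheme and the response measure described above can be sketched as follows. The function names are hypothetical, the variable codes follow the abbreviations used in this chapter's figures (E = elevation, S = soil, M = mountain range), and the areas in the example are made-up values.

```python
from itertools import combinations

def removal_sets(filter_variables):
    """All non-empty sets of filter variables that can be removed together."""
    sets = []
    for size in range(1, len(filter_variables) + 1):
        sets.extend(combinations(filter_variables, size))
    return sets

def percent_area_increase(original_km2, perturbed_km2):
    """Percent difference in area relative to the full-model estimate."""
    return 100.0 * (perturbed_km2 - original_km2) / original_km2

# A Group 3 model with three filter variables
for removed in removal_sets(["E", "S", "M"]):
    print("remove", "+".join(removed))

# Hypothetical areas: a 1,000 km2 estimate that grows to 1,500 km2
print(percent_area_increase(1000.0, 1500.0))  # 50.0
```

A model with one, two, or three filter variables yields one, three, or seven perturbed versions, respectively.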

Response to model perturbation was examined 
across all thirty-one models by types of filter variables 
removed (individually or combined sets). For these ini- 
tial tests, the effect of removing filter variables was 
evaluated directly without considering a potential in- 
fluence by the presence or absence of other filter vari- 
ables in the model. 

To quantify the level of information contributed 
by filter variables, we examined the degree of change 
produced from reduction of the individual filter variables
with the least effect. Statistical significance was
defined by comparing to a constant value defined as
a 5 percent threshold of change. The magnitude
for this threshold value was an arbitrary value
chosen to set a minimum limit beyond which we
considered that a filter variable contributed substantive
information (i.e., a meaningful degree of model
adjustment).

Figure 57.2. Distribution estimate for western banded gecko
(Coleonyx variegatus) after removing elevation (filter variable)
from the model (Source: New Mexico Gap Analysis Project, NM-CFWRU).

Finally, the effect by type of filter variable removed 
(individually), relative to the presence or absence of 
other filter variables within the same model, was also 
evaluated. For this test, the responses from initial 
model perturbations were used to selectively subtract 
the effect of every other filter variable from a particu- 
lar combination. This was done to isolate their indi- 
vidual effects when removed from the model and to 
estimate variations in their effect relative to a potential 
presence or absence of the other filter variables when 
combined in the model (depending on model types: 
Groups 1, 2, or 3). Three kinds of potential combination
cases were evaluated: (1) removal of one filter
variable given no other is present (R[X]), (2) removal
of one filter variable given another is present (R[X|Y]
and R[Y|X]), and (3) removal of one filter variable
given two others are present (R[X|YZ], R[Y|XZ],
R[Z|YX]). In all cases, base variables were present in
the models. For a complete description of methods followed
for the whole analysis, see Gonzalez-Rebeles
(1996).
Figure 57.3. Range in the response (percent area increase) across vertebrate-distribution models when different filter variables
were removed (E = elevation, S = soil, T = temperature, and M = mountain range). Boxes represent the interquartile
range (50 percent of sample data found within top and lower quartile), top whiskers extend down from the ninetieth percentile
(top decile) and bottom whiskers extend up from the tenth percentile (bottom decile), horizontal lines represent the median, and
diamond symbols the mean. Letter n indicates the number of models evaluated for each variable or combination set across the
thirty-one models, where n = 27 (E), n = 15 (S), n = 8 (M), n = 7 (T), n = 41 (ES), n = 8 (EM), n = 4 (SM), n = 7 (ET), n = 4
(ST), n = 4 (ESM), and n = 3 (EST).


Results 


Response to the removal of different types of filter 
variables (individually or combined) across all thirty- 
one species distribution models was highly variable, 
between extremes of zero and very high values (see interquartile
ranges, Fig. 57.3). The median response
across the distribution models showed that some filter
variables or combinations were more influential than
others. When single filter variables were removed,
soil was the most influential (216 percent increase,
n = 15), while elevation was the least influential
(11 percent area increase, n = 27). Mountain range
and temperature showed an intermediate influence 
(129 and 79 percent, n = 8 and n = 7, respectively). 
When sets of two were removed (i.e., across models in 
Groups 2 and 3), soil and mountain range produced 
the most difference (810 percent, n = 4), whereas ele- 
vation and temperature were the least influential (90 
percent, n = 7). When models permitted the removal 
of sets of three filter variables combined (i.e., across 
Group 3), the most influential combination was the set 
of variables elevation-soil-mountain range (856 percent
increase, n = 4) (Fig. 57.3).

Generally, we observed that response values of
larger magnitude occurred after perturbing models
that represented species with a small original distribution
area (predicted from the full set of variables).
In contrast, response values of smaller magnitude 
were observed after perturbing models representing 
species with larger original distribution areas (Fig. 
57.4). For example, when a single filter variable was 
removed, the largest response observed among all 
models was from the desert pocket mouse (Chaetodi- 
pus penicillatus) distribution model, a 1,355.4 percent 
area increase (Table 57.1). The original distribution 
predicted for this species (with the full set of variables) 
covered an area of approximately 4,811 square kilo- 
meters. By contrast, considering species for which re- 
moval of one filter variable produced no response, 
such as American kestrel (Falco sparverius), Lincoln's 
sparrow (Melospiza lincolnii), green-tailed towhee 
(Pipilo chlorurus), and dark-eyed junco (Junco hye- 
malis), their original distribution estimates covered 
139,268 square kilometers, 104,119 square kilome- 
ters, 102,726 square kilometers, and 227,241 square 
kilometers, respectively. These areas were all much 
larger relative to the area of the original distribution 
from the first species cited (Table 57.1 and Fig. 57.4). 


TABLE 57.1.

List of all individual response values observed (percent area increase), organized by level of model
perturbation (removal of different single filter variables, or of different sets of two or three).

Distribution models   Change in area (%) by removal of different filter variables
No.  Range  Spp code

Group 1               1 removed
 1.  (R)    Piab      [illegible]
 2.  (W)    Caga      [illegible]
 3.  (W)    Vuma      73.0
 4.  (R)    Peta      160.2
 5.  (R)    Meur      14.0
 6.  (W)    Coco      93.2
 7.  (W)    Thel      3.8
 8.  (W)    Lage      66.1
 9.  (W)    Chin      266.5
10.  (R)    Typa      916.4
11.  (R)    Thum      360.7
12.  (R)    Pepe      520.6

Group 2               1 removed                 2 removed
 1.  (R)    Cova      430.2        509.7        1,442.9
 2.  (W)    Came      [illegible]  [illegible]  226.6
 3.  (R)    Chpe      24.7         1,355.4      1,546.6
 4.  (W)    Locu      [illegible]  [illegible]  132.9
 5.  (R)    Casi      11.4         79.1         270.5
 6.  (W)    Sipy      14.3         27.4         45.7
 7.  (W)    Stma      4.7          [illegible]  36.3
 8.  (W)    Fasp      0.0          89.6         89.6
 9.  (R)    Synu      [illegible]  [illegible]  248.7
10.  (W)    Vece      46.5         123.5        600.9
11.  (R)    Clga      5.4          32.4         39.6
12.  (R)    Lale      [illegible]  [illegible]  390.7

Group 3               1 removed                              2 removed                              3 removed
 1.  (R)    Scpo      [illegible]  [illegible]  26.0         163.5        [illegible]  [illegible]  1,042.8
 2.  (R)    Ocpr      3.4          20.9         [illegible]  46.2         186.3        278.8        345.7
 3.  (R)    Crle      19.5         96.0         135.3        [illegible]  159.8        640.2        [illegible]
 4.  (R)    Sona      [illegible]  [illegible]  246.2        224.2        260.0        980.2        990.5
 5.  (W)    Meli      0.0          [illegible]  [illegible]  2.4          [illegible]  [illegible]  106.5
 6.  (W)    Pich      0.0          [illegible]  [illegible]  [illegible]  129.9        [illegible]  134.2
 7.  (W)    Juhy      0.0          [illegible]  [illegible]  [illegible]  [illegible]  22.9         [illegible]

Notes: Model groups 1, 2, and 3 represent the number of filter variables in each model.
Different response values of some models (groups 2 and 3) at the same perturbation level were organized by increasing
magnitude (across row) within each model.

Figure 57.4. Individual response values (percent area increase) observed when one filter variable was removed, by model, in
relation to the original distribution predicted for each species (with full set of variables). For Group 1, n = 12 (single filter
variable removed from twelve models); for Group 2, n = 24 (one of two filter variables removed at a time from twelve models);
and for Group 3, n = 21 (one of three filter variables removed at a time from seven models).


A similar tendency was observed when two or three
filter variables were removed. See Gonzalez-Rebeles
(1996) for full details.
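
The link between response magnitude and original distribution size follows from how a percent change is computed. The sketch below assumes the response is the conventional percent change in predicted area; the function names and the 5,000-square-kilometer gain are illustrative assumptions, while the areas and the 1,355.4 percent response are the values reported in the text.

```python
def percent_area_increase(original_km2, perturbed_km2):
    """Percent change in predicted area after filter variables are removed."""
    return (perturbed_km2 - original_km2) / original_km2 * 100.0


def perturbed_area(original_km2, percent_increase):
    """Invert the percent change to recover the perturbed area."""
    return original_km2 * (1.0 + percent_increase / 100.0)


# Desert pocket mouse: 4,811 km2 originally, 1,355.4 percent response.
mouse_perturbed = perturbed_area(4811.0, 1355.4)

# The same hypothetical absolute gain applied to a small and a large
# original distribution (desert pocket mouse and dark-eyed junco areas).
gain_km2 = 5000.0
small = percent_area_increase(4811.0, 4811.0 + gain_km2)
large = percent_area_increase(227241.0, 227241.0 + gain_km2)

print(round(mouse_perturbed), round(small, 1), round(large, 1))
```

Under these assumptions, an identical absolute gain registers as a response of roughly 104 percent for the small distribution but only about 2 percent for the large one, which is why small original areas dominate the upper end of the response values.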

Considering the lowest response values observed 
from the removal of single-filter variables across the 
thirty-one different animal distribution models, values 
ranged between extremes of null (American kestrel, 
Lincoln's sparrow, green-tailed towhee, and dark-eyed 
junco) to a 916 percent change in area (lesser prairie- 
chicken, Tympanuchus pallidicinctus) (see first column 
of data in Table 57.1). However, most of the response 
observed across models at this minimum perturbation 
level presented values nearer to the lower end of this 
range. Nineteen of the thirty-one models showed less
than a 15 percent change in area (Table 57.1, first
column).

The median response across all models (n = 31)
when the least-influential filter variable was removed
was an 11 percent increase in area, which did not differ
(p > 0.05) (Median test; Iman and Conover 1989)
from the 5 percent threshold value set as the minimum
acceptable level of change. The median response within
each model group (Groups 1, 2, or 3) when the
least-influential variable was removed showed that only
Group 1 (83 percent area increase, n = 12) was different
(p < 0.05) from the 5 percent threshold value, whereas
the median responses estimated across Group 2 and
Group 3 (5 and 2 percent, n = 12 and n = 7, respectively)
were not different (p > 0.05) from the 5 percent
threshold (Median test; Iman and Conover 1989). At other
levels of model perturbation (i.e., removal of other
single filter variables or of sets of two or three
combined), we observed much higher response values that
consequently were statistically different from the
5 percent threshold. Group 3 was an exception: because
of the combination of small sample size (n = 7) and
extreme range in responses (3 to 1,048 percent area
increase), the test was not sufficiently powerful to
detect differences (see
Gonzalez-Rebeles 1996).
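
A comparison of this kind can be illustrated with an exact one-sample sign test. This is a sketch only: it assumes the sign test is close in spirit to the median test cited (Iman and Conover 1989), and the original computation may have differed. The function name is hypothetical, and the data are ten of the Group 1 responses listed in Table 57.1, not the full sample.

```python
from math import comb


def sign_test_above(values, threshold):
    """One-sided exact sign test of H0: median <= threshold.

    Returns (number above, number used, p-value). Ties are dropped.
    """
    above = sum(v > threshold for v in values)
    below = sum(v < threshold for v in values)
    n = above + below
    # P(X >= above) for X ~ Binomial(n, 0.5)
    p_value = sum(comb(n, k) for k in range(above, n + 1)) / 2 ** n
    return above, n, p_value


# Ten Group 1 responses (percent area increase) from Table 57.1.
group1_subset = [73.0, 160.2, 14.0, 93.2, 3.8, 66.1, 266.5, 916.4, 360.7, 520.6]
above, n, p = sign_test_above(group1_subset, 5.0)
print(above, n, round(p, 4))
```

With nine of the ten values above the 5 percent threshold, the one-sided p-value is 11/1024, or about 0.011, in line with the finding that Group 1 differed from the threshold.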

Effects of individual variables differed relative to
the presence or absence of other filter variables in the
particular models. Results by type of filter variable
removed for the three potential combinations (R[X], or
R[X|Y] and R[Y|X], or R[X|YZ], R[Y|XZ], and
R[Z|YX]) were highly variable (between extremes of
null to very high response) (Fig. 57.5). The median of the
response values, by type of filter variable, was largest
for mountain range (401 percent increase in area, n =
8) when it was considered the only filter variable in
the model and it was removed. The next-largest response
was observed when soil was removed (361 percent
increase, n = 15), while elevation produced the
least response (31 percent increase, n = 27). Temperature
produced an intermediate effect (90 percent increase
in area, n = 7). The median of the response, when
other filter variables were considered, showed a similar
relationship in size of the effects among the different
filter variables removed. We observed higher values
when soil was removed given the presence of
elevation (241 percent increase in area, n = 11) and
when mountain range was removed given the presence
of soil (184 percent increase, n = 4). The exceptions
were two cases with very low response: (1) when soil
was removed given the presence of temperature (2 percent
increase, n = 3), and (2) when elevation and temperature
were combined (2 percent increase, n = 3).
We observed intermediate values with the removal of
temperature for all potential combinations of other filter
variables (1 or 2) considered present in respective
models. Lowest response values were observed when
elevation was removed (Fig. 57.5).

Figure 57.5. Range in the response (percent area increase) across vertebrate-distribution models when single filter variables
were removed relative to others present in model (E = elevation, S = soil, T = temperature, and M = mountain range). Boxes
represent the interquartile range (50 percent of sample data found within top and lower quartile), top whiskers extend down
from the ninetieth percentile (top decile) and bottom whiskers extend up from the tenth percentile (bottom decile), horizontal
lines represent median, and diamond symbols represent the mean. Letter n refers to number of models evaluated for each
potential case depending on combinations that their filter variables permitted, when n = 27 (E), n = 15 (S), n = 8 (M), n = 7
(T), [illegible], n = 4 (S|M), n = 8 (M|E), n = 4 (M|S), n = 7 (E|T), n = 3 (S|T), n = 3 (T|E), n = 3 (T|S), n = 4 (E|SM),
n = 4 (S|EM), n = 4 (M|ES), n = 3 (E|ST), n = 3 (S|ET), and n = 3 (T|ES).

The effect of a filter variable differed between the
case in which it was the only filter variable present
and the cases in which it was removed in the presence
of one or two other filter variables. For example, when
soil was considered the only filter variable in models
in which it was used and removed, median response
across models produced a 361 percent increase in area
(n = 15). When the soil variable was removed given
the presence of elevation or mountain-range variables,
median response across corresponding models was a
250 percent (n = 11) and 172 percent increase (n = 4),
respectively. When the soil variable was removed,
given the presence of both elevation and mountain-range
variables, median response was 170 percent (n =
4) (Fig. 57.5).

After pooling all response values across each potential
combination case by the number of other filter variables
present in the model (R[1], R[1|1], and R[1|2]),
the median of the response by combination case showed a
general decline for cases when a larger number of filter
variables were present (i.e., 93 percent [n = 57], 26
percent [n = 66], and 13.5 percent [n = 21] increases
in area, respectively). Considering only the data
estimated from Group 2, the removal of single filter
variables when no other was present, R[1], and the removal
of single filter variables when another was present,
R[1|1], were different (p < 0.05) (Sign test; Conover
1980). Median response estimated by removal case
was a 155 percent increase (n = 24) and a 30 percent
increase (n = 24), respectively. Considering the data
from Group 3 as well, the contrast between removal cases
R[1], R[1|1], and R[1|2] showed differences (p < 0.05)
among all three cases (Friedman test; Conover 1980).
Median response by removal case was 81 percent (n =
21), 25 percent (n = 42), and 13 percent (n = 21),
respectively. Consistency of these results with the previous
analysis (contrasted with a 5 percent threshold)
provides further indication of potential interaction
when adding more than one filter variable.

This analysis of the effect by filter variable relative
to the presence or absence of other variables provided
insights useful for identifying interacting variables and
combinations. For example, in comparing Figure 57.3
with Figure 57.5 it can be seen that the largest median
response was produced when the variable set
elevation/soil/mountain range was removed in combination,
followed in magnitude by the removal of the
variable set soil/mountain range (Fig. 57.3). This pattern
suggested the fewest problems of correlation
when these variables were used in combination. Similar
results (high response) were observed when the effects
of the variables soil and mountain range were estimated
relative to the presence of each other
(i.e., S|M, M|S) and of elevation (i.e., S|E, M|E, S|EM,
and M|ES) (Fig. 57.5). This pattern further indicated
that these variables produced less correlation whenever
they were combined.

Conversely, the combination sets of filter variables
assumed to present more problems with correlation
among themselves were elevation/temperature,
soil/temperature, and elevation/soil/temperature (Fig.
57.3). Considering the effect of the variable soil relative
to the presence of the variable temperature, or of
elevation and temperature combined, we found extremely
low response values, indicating a high degree
of spatial correlation between them (Fig. 57.5). The
same was observed with elevation in the presence of
temperature, or soil and temperature (Fig. 57.5).
However, for the effect of temperature relative to the
presence of elevation, soil, or elevation and soil combined,
the median response suggested no general correlation
of temperature with these two variables (Fig. 57.5).
This, in turn, suggested that the response observed
from perturbation cases when the sets soil/temperature
and elevation/soil/temperature were removed
(Fig. 57.3) was mainly produced by the sole effect of
temperature, given that the other variables were correlated
with it. Full data sets and analyses have been
described elsewhere (Gonzalez-Rebeles 1996).


Discussion 


Modeling vertebrate distributions is subject to uncer- 
tainty, with potential errors in sources of information, 
the relationships defined, or model structure (Marcot 
et al. 1983; Morrison et al. 1998). Factors and inter- 
actions that determine habitat suitability for wildlife 
species are generally not well known. Parameters used 
are biased toward those easily measured and for 
which the species are assumed to be most responsive 
(Marcot et al. 1983; Schamberger and O’Neil 1986; 
Verner et al. 1986a). Additionally, sources of error, 
how they are transferred through the modeling 
process, and how they are expressed in the final prod- 
ucts are difficult to determine. This is especially true 
for GIS-habitat models (such as those used for gap 
analysis) that combine spatial or nonspatial informa- 
tion from different sources through several operations 
of overlaying and map transformations (Lodwick et 
al. 1990; Stoms et al. 1992; Edwards et al. 1995). 

Our research reported here should not be considered
an attempt to validate NMGAP vertebrate distribution
models. Model validation involves intensive
field sampling procedures that are expensive and time 
consuming. For a review of issues about habitat- 
model uncertainties and validation theory, see Marcot 
et al. (1983), Verner et al. (1986b), and Morrison et 
al. (1998). See also Karl et al. (Chapter 51), Schaefer 
and Krohn (Chapter 36), and Fielding (Chapter 21) 
for specific discussions on validation problems related 
to detection and interpretation between apparent and 
actual errors. Model reliability can also be assessed 
through sensitivity analysis, when other validation 
procedures are not practical or possible (Heinen and 
Lyon 1989; Lodwick et al. 1990; Lyon et al. 1987; 
Stoms 1992; Stoms et al. 1992). Our work involved a
"sensitivity analysis"; however, it was not addressed
in the traditional sense of evaluating model robustness
(or specific sensitivities) to controlled alteration of
model parameters or modification of relationships
among model elements (Grant 1986) but instead evaluated
the effect of excluding different information
layers (see Lodwick et al. 1990 for a description of the
different approaches for geographic analyses). For this
specific case, we strove to evaluate the effect of filter
variables used for distribution models as an indirect
measure of their contribution.
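
The layer-exclusion idea can be pictured with a toy overlay in which each layer is a set of equal-area grid cells and the predicted distribution is the intersection of all layers. This is a sketch under that assumption only: the cell identifiers, layer contents, and function names are hypothetical and do not reproduce the NMGAP processing.

```python
from functools import reduce


def predicted_cells(base_layers, filter_layers):
    """Intersect every base and filter layer (each a set of cell ids)."""
    return reduce(set.intersection, list(base_layers) + list(filter_layers))


def response_to_removal(base_layers, filters, removed):
    """Percent area increase when the named filter variables are left out."""
    full = predicted_cells(base_layers, filters.values())
    kept = [cells for name, cells in filters.items() if name not in removed]
    reduced = predicted_cells(base_layers, kept)
    return (len(reduced) - len(full)) / len(full) * 100.0


# Hypothetical landscape of 100 equal-area cells.
base = [set(range(0, 60)), set(range(10, 80))]   # e.g., county and land cover
filters = {
    "elevation": set(range(20, 50)),
    "temperature": set(range(20, 52)),            # nearly coincides with elevation
    "soil": set(range(0, 40)),
}

for name in filters:
    print(name, response_to_removal(base, filters, {name}))
print("elevation+temperature",
      response_to_removal(base, filters, {"elevation", "temperature"}))
print("all three", response_to_removal(base, filters, set(filters)))
```

In this toy case, removing elevation or temperature alone produces no response because each is spatially redundant with the other, whereas removing both together, or soil alone, enlarges the predicted area by 50 percent. A null response therefore signals spatial correlation among layers, not necessarily an irrelevant habitat association.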

Animal distribution models in NMGAP are based 
on extensive literature review and expert consultation. 
From the information available on species-habitat as- 
sociations, only those variables that can be repre- 
sented spatially (digitized) or considered appropriate 
to characterize species habitats are considered for 
modeling (Thompson et al. 1996). Based on system- 
atic selection of variables and expert review at differ- 
ent stages, we expected the modeling approach would 
produce reasonably accurate distribution estimates (at 
the desired landscape scale and assuming a correct 
model). The effect measured from the different combinations
of filter variables was considered as the level of
"additional adjustment" provided to the "basic
model" (i.e., within known occurrences and land-cover
associations). In our example using the western
banded gecko (Coleonyx variegatus) (Figs. 57.1 and
57.2), removing elevation from the final model produced
a 430 percent change in area, yet the final model
accurately predicted all thirteen known field locations
for the species (Degenhardt et al. 1996). In this case, the
contribution of elevation combined with base variables and
the other filter variable in the model (land cover, presence
by county within watershed limits, and soil) provided
the expected refinement and adjustment (assuming
that known associations for base and filter variables
are correct). Another example is that of the distribution
model for desert pocket mouse. The removal of
the variables soil and elevation produced 1,355.4 percent
(maximum response when a single filter variable
is removed, see Results above) and 24.7 percent
change in area, respectively, and the final model that
retained both variables included thirty of thirty-one
locations reported by Findley et al. (1975).

Results should be interpreted relative to the efficacy 
of using those different combinations of variables. All 
filter variables included in a model should produce an 
effect to be useful, and this is expected whether one, 
two, or three filter variables are used. The contrary
would indicate some sort of interaction among variables
that consequently could be interpreted as the
variable or combination not being useful. Our comparisons
among model groups were intended to identify
combinations with potential problems (interactions),
and the pattern of response was examined for
insights about whether the model performed as expected.
Nevertheless, the real direction of this effect (correct
adjustment) and the optimum combination of filter
variables can only be evaluated from field
verification.

For example, white-tailed ptarmigan (Lagopus leu- 
curus) and the dwarf shrew (Sorex nanus) have been 
associated with montane habitats (Bailey 1928; Ligon 
1961; Findley et al. 1975). However, the first species 
modeled (with two filter variables) showed a large re- 
sponse when elevation was removed (212 percent). 
Response was almost null to the removal of mountain 
range (2 percent). By contrast, the dwarf shrew model 
(with three filter variables) showed large responses 
when soil (246 percent) or mountain range (223 per- 
cent) was removed. In this case, the least response was 
observed when elevation was removed (2 percent). For 
the first species, the smaller effect of mountain range 
was potentially due to some sort of correlation of this 
variable with elevation and/or other base variables. It 
is probable that limits defining the mountain-range 
polygons, used to predict its distribution, were located 
at lower elevations than those represented by the ac- 
tual elevation variable used (possibly nested within 
mountain-range polygons). In the case of the dwarf
shrew model, the combination of two filter variables
(suitable soil associations within mountain-range
polygons) appeared to control this estimate relative
to elevation.

Traditionally, various parameters have been consid- 
ered biologically significant for spatial analysis of the 
landscape, such as area measurements, fractal dimen- 
sion, and indices to assess shape, contiguity, and dis- 
persion patterns (O'Neill et al. 1988b; Turner et al. 
1989b; LaGro 1991). We did not prepare maps for all 
perturbation levels applied to all species, thus restrict- 
ing some spatial analyses. Area is considered a signifi- 
cant measure because it directly reflects the degree of 
spatial alteration produced from the original distribu- 
tion patch predicted (full set of variables) when a per- 
turbation is applied to the model (removal of filter 
variables).
However, not much is known about the size of the
effects that different filter variables contribute to distri- 
bution models. Initial gap analysis research evaluated 
the effect of modifying different variables using several 
trial models for different species (Scott et al. 1993; But- 
terfield et al. 1994; Csuti and Crist 1998). In particu- 
lar, Butterfield et al. (1994) reported how the use of fil- 
ter variables improves distribution estimates for 
different taxa. However, none of these research efforts 
measured the size of changes produced. Other studies 
examined the sensitivity of richness mapping to habitat 
generalization and errors in vegetation classification 
(Stoms 1991), to variations in the resolution of 
mapped habitats with fixed sampling unit sizes (Stoms 
1992), and to variations in sampling unit size with 
fixed resolutions for the habitats mapped (Stoms 
1994). In these cases, Stoms (1991, 1992, 1994)
simplified the models in the traditional sense of "sensitivity
analysis" (as mentioned above) by altering information
about base variables (not removing complete
subsets of filter variables). In other words, the author
assessed taxonomic and resolution sensitivities (see 
Lodwick et al. 1990) but did not quantify levels of 
change in area of the estimates. Thus, there is little 
published information for comparative evaluation of 
our findings. 

Magnitudes of the measured responses indicated 
some relationship with size of the original distribu- 
tion estimates (not perturbed model) of the species 
sampled. This was a function of the arithmetic rela- 
tionship established by calculating proportions of 
change, which is unavoidable in this kind of data. 
However, in addition to the arithmetic relationship, 
the effect of interest (by filter variable, relative to 
combination used and species modeled) was still con- 
sidered measurable by the fact that both large and 
small values were observed for species models with 
similar original distribution size (with exception at 
the extremes of larger distributions) (Fig. 57.4). For 
example, with one variable removed, maximum re- 
sponse (1,355 percent) observed was for a species 
(desert pocket mouse) whose original distribution 
was small (4,811 square kilometers). At the other ex- 
treme, two species that also had relatively small dis- 
tributions, the white-tailed ptarmigan (2,326 square 
kilometers) and the dwarf shrew (5,325 square
kilometers), presented very small responses (2 percent)
when the mountain range and elevation variables,
respectively, were removed.

The large variability in the response of the models 
to perturbation was expected because of the nature 
of model structure and function. A variety of poten- 
tial interrelationships may occur by combining dif- 
ferent filter variables, each corresponding to particu- 
lar environmental elements (represented spatially as 
specific polygons) that defined the habitat associa- 
tions. However, our results indicated that some filter 
variables were more influential than others across 
the models. 

Our selection of a 5 percent threshold in area 
change to evaluate the level of contribution from filter 
variables can be questioned, but the 5 percent thresh- 
old has some biological significance in conservation 
planning considering the size of the original vertebrate 
distribution estimates (complete set of variables). Sizes 
ranged from a minimum area of 1,984 square kilome- 
ters for the white-ankled mouse (Peromyscus pec- 
toralis) to a maximum of 227,241 square kilometers 
for the dark-eyed junco. Five percent of each of
these areas thus represents values ranging from about 99
to 11,362 square kilometers. The minimum reserve
size required to maintain a viable population for small 
mammals is estimated to be from 10 to 100 square 
kilometers and for larger mammals from 10,000 to 
100,000 square kilometers (Schonewald-Cox et al. 
1983). This means that 5 percent of the smallest dis- 
tribution (1,984 square kilometers) represents an area 
that is approximately the minimum reserve size re- 
quired for small mammals (upper limit) and 5 percent 
from the largest distribution (227,241 square kilome- 
ters) represents the minimum size required for large 
animals (lower limit). 
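
The arithmetic behind this comparison is easy to check. The short sketch below uses only the areas and reserve-size ranges quoted in the text; the variable names are illustrative.

```python
THRESHOLD = 0.05

# Original distribution areas reported in the text (square kilometers).
smallest_km2 = 1984      # white-ankled mouse
largest_km2 = 227241     # dark-eyed junco

# Minimum reserve sizes cited from Schonewald-Cox et al. (1983), in km2.
small_mammal_reserve = (10, 100)
large_mammal_reserve = (10_000, 100_000)

five_pct_small = THRESHOLD * smallest_km2
five_pct_large = THRESHOLD * largest_km2

print(round(five_pct_small, 1), round(five_pct_large, 1))
print(five_pct_small <= small_mammal_reserve[1])
print(large_mammal_reserve[0] <= five_pct_large <= large_mammal_reserve[1])
```

Five percent of the smallest distribution sits just under the upper bound for small mammals, and five percent of the largest sits just above the lower bound for large mammals, which is the sense in which the threshold carries some biological meaning.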

However, the 5 percent threshold was an arbitrary 
value determined to set a minimum limit to evaluate if 
filter variables were useful. It could have been set at 
any level, depending on the efficacy attempted and 
taking into consideration typical trade-offs between 
efficiency and the costs inherent to the project. (Each 
variable added would represent a new thematic layer 
in the GIS that would require time and cost for
development and processing.) Filter variables producing a
response value equal to or below this threshold were
assumed to contribute insufficient information to improve
the model. By contrast, all levels of response
above the threshold were assumed to indicate an ade- 
quate contribution. 

Specific weights of relevance between differences 
in magnitudes of response above this level (5 per- 
cent) were not considered further because of the 
arithmetic relationship between the magnitude of re- 
sponse and area of distribution, as discussed before. 
Thus, contribution was evaluated by differentiating 
between levels of response versus no response (above 
or below the threshold). This 5 percent threshold 
was useful for determining that some of the filter 
variables did not contribute valuable information for 
some models (assumed adjustment) and that this 
problem mainly occurred for model types formed by
two or three filter variables, due to potential spatial
correlation.

A relevant issue for modeling is that of balancing 
the cost and effort of model building with the quality 
of the output. Stoms (1991, 1992) reported that these 
types of GIS-habitat models are not robust enough in 
regard to variations in the information about habitat 
(as represented by land cover, a base variable), re- 
flected as variations of species richness mapping. Our 
results further indicated a limit to the usefulness of fil- 
ter variables when applied in increasing numbers, sug- 
gesting that special care be taken regarding the quality 
of information about habitat associations and the se- 
lection of filter variables. 


Conclusions 


Results indicated that most filter variables presented 
adequate levels of influence when used in the models 
(required adjustment), although some were more in- 
fluential than others. Our findings also suggested 
that special care must be taken when applying more 
than two filter variables due to potential spatial
correlation as the number of filter variables increases in
the model. However, model developers will not
know at the outset which of the variables they plan
to use will be best and therefore cannot remove
less-influential variables beforehand. There is no way to
know a priori how these variables will interact spatially.
Due to the nature of this modeling approach, it
is clear that all filter variables combined produce a
cumulative effect by model (e.g., see the effect of
combining two and three filter variables in Table
57.1). However, individual effects are difficult to
predict and will depend on the particular spatial extent
each one depicts.

A practical recommendation is to apply a test sim- 
ilar to this type of sensitivity analysis to evaluation 
of distribution models at the stage when the prelimi- 
nary maps are being reviewed by experts. This sim- 
ple method will provide enlightening information by 
detecting all potential types of models and variables 
for which major problems may arise. For example, in 
the case of NMGAP, the variable elevation, followed 
by temperature, showed the most correlation 
problems, as did models with three filter variables. 
Temperature was among the filter variables eliminated
from final vertebrate distribution modeling
and map production (Thompson et al. 1996). For a
different study area, problem detection will depend
on the spatial representation of habitat features
in that particular region and on model-specific
characteristics.

We undertook our research in the context of gap 
analysis, but we evaluated a standard modeling ap- 
proach that is used in many species modeling endeav- 
ors. Our results provide quantitative evidence for the 
influence of filter variable combinations. This insight 
can be useful to other researchers, serving as a refer- 
ence about the potential effects of the variables used 
for their own distribution estimates. Our intent was 
not to estimate absolute accuracy of such modeling, 
but to illustrate the relative efficacy and efficiency in- 
herent in using filter variables in such predictive mod- 
eling. We hope our experiences with trying to under- 
stand predictive modeling of animal distribution for 
biodiversity can aid in developing the philosophy of 
biodiversity in conservation as encouraged by Calli- 
cott et al. (1999). 

Species distribution models sampled from New Mexico Gap Analysis Project. Model groups 1, 2, and 3 indicate the
corresponding numbers of filter variables each includes.

     Species                              Habitat variable categories
No.  Range  Name                          Base variables    Filter variables

Group 1
 1.  (R)    Pipilo aberti                 Cnty       LaCo   Elvt
 2.  (W)    Callipepla gambelii           Cnty/Wshd  LaCo   Elvt
 3.  (W)    Vulpes macrotis               Cnty/Wshd  LaCo   Elvt
 4.  (R)    Peucedramus taeniatus         Cnty/Wshd  LaCo   Elvt
 5.  (R)    Melanerpes uropygialis        Cnty/Wshd  LaCo   Elvt
 6.  (W)    Coluber constrictor           Cnty/Wshd  LaCo   Elvt
 7.  (W)    Thamnophis elegans            Cnty/Wshd  LaCo   Elvt
 8.  (W)    Lampropeltis getulus          Cnty/Wshd  LaCo   Elvt
 9.  (W)    Chaetodipus intermedius       Cnty/Wshd  LaCo   Soil
10.  (R)    Tympanuchus pallidicinctus    Cnty/Wshd  LaCo   Soil
11.  (R)    Thomomys umbrinus             Cnty/Wshd  LaCo   Soil
12.  (R)    Peromyscus pectoralis         Cnty/Wshd  LaCo   Soil

Group 2
 1.  (R)    Coleonyx variegatus           Cnty/Wshd  LaCo   Elvt  Soil
 2.  (W)    Catherpes mexicanus           Cnty/Wshd  LaCo   Elvt  Soil
 3.  (R)    Chaetodipus penicillatus      Cnty/Wshd  LaCo   Elvt  Soil
 4.  (W)    Loxia curvirostra             Cnty/Wshd  LaCo   Elvt  Soil
 5.  (R)    Cardinalis sinuatus           Cnty/Wshd  LaCo   Elvt  Temp
 6.  (W)    Sitta pygmaea                 Cnty/Wshd  LaCo   Elvt  Temp
 7.  (W)    Sturnella magna               Cnty/Wshd  LaCo   Elvt  Temp
 8.  (W)    Falco sparverius              Cnty/Wshd  LaCo   Elvt  Temp
 9.  (R)    Sylvilagus nuttallii          Cnty       LaCo   Elvt  MntR
10.  (W)    Vermivora celata              Cnty       LaCo   Elvt  MntR
11.  (R)    Clethrionomys gapperi         Cnty       LaCo   Elvt  MntR
12.  (R)    Lagopus leucurus              Cnty       LaCo   Elvt  MntR

Group 3
 1.  (R)    Sceloporus poinsettii         Cnty       LaCo   Elvt  Soil  MntR
 2.  (R)    Ochotona princeps             Cnty       LaCo   Elvt  Soil  MntR
 3.  (R)    Crotalus lepidus              Cnty       LaCo   Elvt  Soil  MntR
 4.  (R)    Sorex nanus                   Cnty       LaCo   Elvt  Soil  MntR
 5.  (W)    Melospiza lincolnii           Cnty/Wshd  LaCo   Elvt  Soil  Temp
 6.  (W)    Pipilo chlorurus              Cnty/Wshd  LaCo   Elvt  Soil  Temp
 7.  (W)    Junco hyemalis                Cnty/Wshd  LaCo   Elvt  Soil  Temp

Variable key

Cnty      = Species presence by county
Cnty/Wshd = Presence by county extrapolated to limits within all intersected watersheds
LaCo      = Land cover associations
Elvt      = Association with elevation
Soil      = Association with soil type
Temp      = Association with temperature gradient
MntR      = Association with mountain range

Species range

W = Widespread
R = Restricted


A Distribution Model for the Eurasian Lynx
(Lynx lynx) in the Jura Mountains, Switzerland


Fridolin Zimmermann and Urs Breitenmoser 


The lynx (Lynx lynx) populations of western and
southern Europe disappeared during the eigh- 
teenth and nineteenth centuries as a consequence of 
direct persecution, alteration of the ecosystem (forest 
destruction and expansion of cultivated land), and ex- 
cessive reduction of wild ungulates (Breitenmoser 
1998). Since the end of the nineteenth century, forests 
have regenerated in many mountainous regions of
Europe, and the wild ungulate populations have
recovered quickly. This improvement in the ecological
conditions also inspired the idea to bring back large
predators. Lynx were re-introduced to the Swiss Alps 
and the Swiss Jura Mountains in the 1970s (Breiten- 
moser et al. 1998). Although the Swiss reintroductions 
are considered to be rare examples of successful 
translocations of large predators (Yalden 1993), these 
small populations cannot yet be regarded as viable. In 
the Swiss Jura Mountains, only the southern half of 
the range is permanently occupied by lynx. The rea- 
sons for the lack of vitality are not known; they may 
include ecological, anthropogenic, and intrinsic (ge- 
netic) factors. However, habitat suitability analyses 
were never carried out for the Jura Mountains, al- 
though such a tool is recognized to be important for 
reintroduction programs. 
The purpose of this study is to assess small-scale 
habitat variables and their importance to lynx recolonization
of the whole Jura Mountains and to estimate
available lynx habitat throughout the mountain range.
We used a geographic information system (GIS) to de- 
termine if easily available spatial data can successfully 
describe lynx habitat and contribute to a predictive 
spatial model (see Guisan and Zimmermann [2000] 
for a review). The model was built using data from 
adult, resident lynx that were followed by means of 
radiotelemetry in the southern part of the Swiss Jura 
Mountains. We then extrapolated the model over the 
entire Swiss Jura Mountains and evaluated the relia- 
bility of the resulting maps using radio fixes from dis- 
persing subadult lynx. Such a spatial model permits 
prediction of the future distribution and the potential 
size of the lynx population in the Jura Mountains and 
could be of use in drafting a lynx conservation plan 
for this mountain range. 


Study Area 


The study was performed in the Jura Mountains, a sec- 
ondary limestone mountain chain forming the north- 
western border of Switzerland with France (Fig. 58.1). 
The altitude varies from 372 meters (Lake of Geneva) 
to 1,679 meters (Mont Tendre). The main study area 
(680 square kilometers) was confined to the northern 
part of the canton of Vaud (VD). Lynx were also followed
into the adjoining areas of the canton of Neuchâtel
(NE) and into France; this total area is approximately
3,000 square kilometers. Deciduous forests
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


Figure 58.1. Location of the study area in the Jura Mountains
of Switzerland (small map) and France. The grid (large map)
shows the 1,085 quadrats where the predictor and response
variables were sampled. Cells are hatched as follows: \ =
cells visited by female lynx; / = visited by male lynx; X = visited
by both sexes. In addition, S = southern, C = central, and N =
northern part of the Swiss Jura Mountains.


along slopes and coniferous forests on the ridges cover 
53 percent of the highlands. Cultivated areas are typi- 
cally pastures. The human population reaches a density 
of 120 per square kilometer in most parts of the Jura 
Mountains, and people living on the Swiss Plateau use 
the highlands intensively for recreation. The center of 
the study area is crossed by two railways, a highway, 
and some additional roads with dense traffic. As in the 
Swiss Alps (Breitenmoser and Haller 1987; Haller 
1992), western roe deer (Capreolus capreolus) and
chamois (Rupicapra rupicapra) are the main prey of 
lynx in the Jura Mountains (Jobin et al. 2000). 


Method 


From April 1988 to June 1998 a total of twenty-nine 
lynx were surveyed by means of radiotelemetry in the 
Jura Mountains (Breitenmoser et al. 1993; Swiss Lynx
Project, unpublished data). Some of the radio-tagged
lynx roamed into the French part of the mountain 
range. All analyses for this study were done using the 
radio fixes from the Swiss part of the study area, be- 
cause environmental data for France were unavailable. 

We used a total of 6,282 radio fixes of eleven resi- 
dent lynx followed from 1988 to 1998 to generate the 
models on the assumption that these adult, territorial 
individuals (Breitenmoser et al. 1993) would occupy 
the best habitat. The sample unit was a 1 × 1-kilometer
quadrat. The sampling area was restricted to all the 
quadrats intercepted by the minimum convex polygon 
(MCP), including all the fixes of the resident lynx. 
Quadrats falling within France were disregarded. A 
total of 1,085 quadrats remained for the analyses (Fig. 
58.1). We split the data into two subsets. One was used
for calibration of the model, and the other was used to 
evaluate the model predictions (split sample approach; 
see Guisan and Zimmermann 2000). 

We compared the results from different sample 
sizes ranging from two hundred to one thousand 
quadrats for the calibration of the model in order to 
test the consistency (stability) of our model. Since all 
samples greater than three hundred quadrats pro- 
duced the same parameters, we decided on a sample of 
four hundred quadrats to calibrate our model. The 
quadrats were chosen randomly with a distance con- 
straint between them in order to reduce spatial auto- 
correlation. The remaining 685 quadrats were then 
used to evaluate the model. 
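
The split-sample selection described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the chapter does not report the distance used, so `min_dist_km`, the 35 x 31 grid, and the function name are assumptions, and a greedy random draw of this kind may return fewer quadrats than requested when the constraint is tight.

```python
import random


def split_with_distance_constraint(cells, n_calibration, min_dist_km, seed=0):
    """Randomly pick calibration quadrats that are at least min_dist_km apart.

    cells: list of (x_km, y_km) quadrat centres.
    Returns (calibration, evaluation); evaluation holds every quadrat not chosen.
    """
    rng = random.Random(seed)
    candidates = list(cells)
    rng.shuffle(candidates)
    calibration = []
    for x, y in candidates:
        if len(calibration) == n_calibration:
            break
        # Accept a candidate only if it is far enough from all quadrats chosen so far.
        if all((x - cx) ** 2 + (y - cy) ** 2 >= min_dist_km ** 2
               for cx, cy in calibration):
            calibration.append((x, y))
    chosen = set(calibration)
    evaluation = [c for c in cells if c not in chosen]
    return calibration, evaluation


# Hypothetical rectangular grid of 1,085 quadrats (the real sampling area is irregular).
grid = [(x, y) for x in range(35) for y in range(31)]
# min_dist_km = 1.2 only forbids quadrats that share an edge (an assumed value).
calibration, evaluation = split_with_distance_constraint(grid, 300, 1.2)
print(len(calibration), len(evaluation))
```

The remaining quadrats form the evaluation set, mirroring the calibration/evaluation split used in the chapter.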

The response variable is the presence/absence of 
lynx in each quadrat. Lynx were considered to be 
present in each quadrat containing one or more 
telemetry fixes. From this set of quadrats, three sets 
were prepared (Fig. 58.1), using radio fixes of (1) all 
lynx (females and males), (2) females only, and (3) 
males only. 
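
As an illustration of how the response variable can be derived, the sketch below assigns telemetry fixes to 1-kilometer quadrats and marks a quadrat as occupied when it holds at least one fix. The coordinates, the tiny grid, and the function name are hypothetical and serve only to show the three response sets (both sexes, females, males).

```python
import math


def presence_absence(fixes, quadrats, cell_km=1.0):
    """Return {quadrat: 1 or 0}; a quadrat is present if it holds one or more fixes.

    fixes: (x_km, y_km) telemetry locations; quadrats: iterable of (col, row) indices.
    """
    occupied = {(math.floor(x / cell_km), math.floor(y / cell_km))
                for x, y in fixes}
    return {q: int(q in occupied) for q in quadrats}


# Hypothetical 3 x 2 grid and fixes, for illustration only.
quadrats = [(col, row) for col in range(3) for row in range(2)]
female_fixes = [(0.2, 0.7), (0.9, 0.1), (2.5, 1.5)]
male_fixes = [(1.1, 0.4)]

females = presence_absence(female_fixes, quadrats)
males = presence_absence(male_fixes, quadrats)
both = presence_absence(female_fixes + male_fixes, quadrats)
print(both)
```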

The eighteen predictor variables (Table 58.1) were 
selected from among all statistical parameters avail- 
able according to our empirical knowledge of the 
lynx’s ecological requirements, but also with respect to 
their availability in digital form. A previous study of 
lynx recolonization of the Jura Mountains (Breiten- 
moser and Baettig 1992), based on random observa- 
tions, had shown that the lynx distribution was a pri- 
ori determined by the extension of the forest and 
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human activity. Each of these factors can be described 
in terms of many concurrent environmental predictors 
and can be correlated to the lynx distribution. The en- 
vironmental predictors came from the database of the 
Federal Office of Statistics and from the database of 
the Federal Office of Topography. Both databases had 
an accuracy of 1 hectare and were in digital form and 
ready to be analyzed using GIS ArcView software 
(ESRI 1996a,b,c). From the hectare information, we 
then computed summary statistics for each square-kilometer
quadrat: (1) the proportion of the different
land-use predictors, and (2) the mean value in the case
of the quantitative predictors (fringe length, elevation,
slope, human population density, and aspect of the
slope [predictors 1-18 in Table 58.1]).
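
The aggregation from hectare cells to square-kilometer quadrats can be sketched as follows, assuming each quadrat is a 10 x 10 block of 1-hectare cells so that a cell count per land-use class is already in ha/km². The grids, class names, and function name are hypothetical; the authors worked in ArcView, not in code of this kind.

```python
def summarize_quadrat(landuse, elevation, row0, col0, size=10):
    """Summarize one quadrat made of size x size hectare cells.

    Returns (hectares per land-use class, mean elevation in meters).
    """
    hectares = {}
    elevation_sum = 0.0
    for r in range(row0, row0 + size):
        for c in range(col0, col0 + size):
            hectares[landuse[r][c]] = hectares.get(landuse[r][c], 0) + 1
            elevation_sum += elevation[r][c]
    return hectares, elevation_sum / (size * size)


# Hypothetical quadrat: six rows of forest, four rows of pasture, rising terrain.
landuse = [["forest"] * 10 if r < 6 else ["pasture"] * 10 for r in range(10)]
elevation = [[1000 + 10 * r] * 10 for r in range(10)]

hectares, mean_elevation = summarize_quadrat(landuse, elevation, 0, 0)
print(hectares, mean_elevation)
```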

Generalized linear models (GLMs; McCullagh and
Nelder 1989; see Nicholls 1989) were used to select 
those predictors that best explained the presence/ab- 
sence of lynx. All the analyses were computed in S- 
PLUS (MathSoft) according to the method described 


TABLE 58.1. 


Sources of the eighteen predictors used in the logistic 
regression analysis. 


Predictor^a                   Units      Source^b
Forest areas                  ha/km²     GEOSTAT
Other wooded areas            ha/km²     GEOSTAT
Fringe length                 meter      FOT
Horticulture, viticulture     ha/km²     GEOSTAT
Arable land, meadows          ha/km²     GEOSTAT
Pastures                      ha/km²     GEOSTAT
Pastures in mountain areas    ha/km²     GEOSTAT
Lakes and rivers              ha/km²     GEOSTAT
Nonproductive vegetation      ha/km²     GEOSTAT
Areas without vegetation      ha/km²     GEOSTAT
Built-up areas                ha/km²     GEOSTAT
Rest areas, parks             ha/km²     GEOSTAT
Roads and railways            ha/km²     GEOSTAT
Elevation                     meter      GEOSTAT
Slope                         degree     GEOSTAT
Eastness                      (cosine)   GEOSTAT
Northness                     (sine)     GEOSTAT
Human population density      ind./ha    GEOSTAT


^a Predictors had an accuracy of 1 hectare.
^b GEOSTAT = database of the Federal Office of Statistics; FOT =
Vector 200 database of the Federal Office of Topography.


in Guisan et al. (1999). To facilitate the final ecologi- 
cal discussion of the model, we did not orthogonalize 
the predictors (e.g., through principal component 
analysis) prior to the model calibration. Predictors 
were only selected when they significantly contributed 
to the deviance reduction, as attested by a χ²-test (p-value < 0.05).
In addition, we did not retain the predictors
that explained less than one percent of the
total deviance to avoid having predictors with little or
no biological meaning appearing in the final model.
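
The two retention rules above (a significant χ² test on the deviance reduction, and at least one percent of the total deviance explained) can be illustrated with a small sketch. The deviance values are invented for the example, the function names are assumptions, and the closed-form tail probabilities cover only 1 and 2 degrees of freedom; this is not the S-PLUS code used by the authors.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (closed forms for df 1 and 2)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or 2")


def keep_predictor(null_deviance, deviance_without, deviance_with, df,
                   alpha=0.05, min_share=0.01):
    """Apply both retention rules to one candidate predictor.

    Returns (keep, p_value, share of total deviance explained by the predictor).
    """
    reduction = deviance_without - deviance_with
    p_value = chi2_sf(reduction, df)
    share = reduction / null_deviance
    return (p_value < alpha and share >= min_share), p_value, share


# Hypothetical deviances. The first term passes both rules; the second is
# significant at 0.05 but explains less than one percent of the null deviance.
strong = keep_predictor(500.0, 400.0, 390.0, df=1)
weak = keep_predictor(500.0, 390.0, 386.0, df=1)
print(strong, weak)
```

The second case shows why the one-percent rule matters: with many quadrats, a term can be statistically significant while contributing almost nothing to the fit.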

We used the receiver operating characteristic (ROC; 
see Fielding, Chapter 21), a threshold-independent 
measure of accuracy, to evaluate our models. An ROC 
plot is obtained by plotting the true positive proportion
on the y-axis against the false positive proportion
on the x-axis. The area under the ROC function 
(AUC) is usually taken as the index of performance 
because it provides a single measure of overall accu- 
racy independent of any particular threshold in the 
training data (Fielding, Chapter 21). Final GLMs were 
fitted and evaluated using custom S-Plus functions 
(written by A. Guisan). 
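
For readers unfamiliar with the AUC, the sketch below computes it through its rank interpretation: the probability that a randomly chosen occupied quadrat receives a higher predicted value than a randomly chosen unoccupied one, with ties counted as one half. The data are invented, and this is not the custom S-Plus function mentioned above.

```python
def auc(observed, predicted):
    """Area under the ROC curve from pairwise comparisons of presences and absences."""
    presences = [p for y, p in zip(observed, predicted) if y == 1]
    absences = [p for y, p in zip(observed, predicted) if y == 0]
    wins = 0.0
    for p in presences:
        for a in absences:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presences) * len(absences))


# Hypothetical quadrats: 1 = lynx present, 0 = absent, with predicted probabilities.
observed = [1, 1, 0, 0]
predicted = [0.9, 0.4, 0.6, 0.1]
print(auc(observed, predicted))  # 3 of the 4 presence/absence pairs are ordered correctly: 0.75
```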

We compared the three lynx distribution maps by 
subtracting the computed probabilities of lynx pres- 
ence for each quadrat in the GIS: (1) total (both sexes 
combined) minus female, (2) total minus male, and (3) 
female minus male. Values close to -1 or +1 indicate a 
high discrepancy between corresponding grid cells, 
whereas values close to 0 indicate a high conformity. 
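
A minimal sketch of this map comparison, assuming the two maps are stored as equal-length lists of per-quadrat probabilities (the values and the function name are hypothetical):

```python
def compare_maps(prob_a, prob_b, tolerance=0.2):
    """Subtract map b from map a quadrat by quadrat.

    Returns (smallest difference, largest difference,
    share of quadrats whose difference lies within +/- tolerance).
    """
    differences = [a - b for a, b in zip(prob_a, prob_b)]
    within = sum(1 for d in differences if -tolerance <= d <= tolerance)
    return min(differences), max(differences), within / len(differences)


# Hypothetical probabilities for four quadrats.
female_map = [0.9, 0.5, 0.1, 0.7]
male_map = [0.8, 0.9, 0.1, 0.2]
lowest, highest, share_within = compare_maps(female_map, male_map)
print(lowest, highest, share_within)
```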

We extrapolated the resulting model over the entire 
Swiss Jura Mountains in the GIS. GLM models are 
readily implemented in a GIS by building a single for- 
mula in which each coefficient multiplies its related 
predictor variable (Guisan et al. 1999). The results of
the calculations are on the scale of the linear
predictor, so the inverse logistic transformation is
then necessary to obtain probability values between 0
and 1 at every quadrat of the grid. Finally, we evaluated
the resulting models against the spatial behavior of
dispersing subadult lynx.
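
The single-formula implementation can be sketched as below. The structure (intercept plus coefficient times predictor, followed by the inverse logistic transformation) follows the description above, but every coefficient value is hypothetical, since the fitted coefficients are not reported in this chapter; the names `COEFFICIENTS` and `predict_presence` are likewise assumptions.

```python
import math


def inverse_logit(eta):
    """Map a linear-predictor value to a probability, avoiding overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Hypothetical coefficients for the pooled model form elev + elev^2 + slo + forest.
COEFFICIENTS = {
    "intercept": -12.0,
    "elev": 0.02,        # per meter
    "elev2": -0.00001,   # per square meter
    "slo": 0.15,         # per degree
    "forest": 0.03,      # per ha/km2
}


def predict_presence(elev_m, slope_deg, forest_ha):
    """Probability of lynx presence for one quadrat under the hypothetical coefficients."""
    eta = (COEFFICIENTS["intercept"]
           + COEFFICIENTS["elev"] * elev_m
           + COEFFICIENTS["elev2"] * elev_m ** 2
           + COEFFICIENTS["slo"] * slope_deg
           + COEFFICIENTS["forest"] * forest_ha)
    return inverse_logit(eta)


print(predict_presence(elev_m=1000, slope_deg=15, forest_ha=60))
```

In a GIS, the same expression would be evaluated once per quadrat with the predictor layers in place of the scalar arguments.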


Results 


The proportion of deviance significantly explained 
(adj-D²) in the models ranged from 0.39 to 0.44, corresponding
to a moderate fit of the models (both sexes
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TABLE 58.2. 


Results of the GLM analyses with the three different sets of response variables. 


                                                            Proportion of        Calibration   Evaluation
Response variables            GLM formulas                  explained variance   AUC           AUC
Presence/absence of lynx      elev² + slo + forest          0.435                0.89          0.88
Presence/absence of females   elev² + slo + forest          0.386                0.87          0.84
Presence/absence of males     elev + slo + forest + roads   0.423                0.90          0.89


pooled, females, males). The AUC at calibration and 
evaluation ranged between 0.84 and 0.90 (Table 
58.2). Three out of eighteen predictors were selected 
in the final model when presence/absence data of both 
sexes were used. They were elevation (second-order 


TABLE 58.3.


Survival of subadult lynx (F = females, M = males) according to their habitat use.


                          Probability class^b
Model^a      Lynx     1       2       3       4       5      Destiny
Both sexes   F12      0.0     4.4     2.2     3.3     90.1   Survived
             M14      13.3    3.5     12.1    12.2    59.0   Survived
             F23      12.1    2.1     19.7    7.6     56.5   Survived
             F26      0.0     1.8     21.4    25.0    51.8   Survived
             F13      0.0     17.1    I2      28.6    37.1   Died
             F17      0.0     0.0     11.1    66.7    2o     Died
             M16      66.8    4.8     11.9    2.3     14.2   Died
Females      F12      0.0     2.3     4.3     6.6     86.8   Survived
             M14      8.1     6.9     6.9     14.5    63.6   Survived
             F23      3.6     5.8     10.3    8.5     71.8   Survived
             F26      0.0     2.4     7.8     38.0    51.8   Survived
             F13      8.6     5       20.0    37.1    28.6   Died
             F17      0.0     22.2    44.4    33.3    0.0    Died
             M16      61.9    pál     16.7    2i      11.9   Died
Males        F12      4.4     3.3     6.6     6.6     79.1   Survived
             M14      14.5    9.8     13.3    16.2    46.2   Survived
             F23      24d     8.5     1.8     20.6    48.0   Survived
             F26      2.4     6.0     19.6    44.0    28.0   Survived
             F13      8.6     T0      34.3    T1      28.7   Died
             F17      11.2    22.2    44.4    22.2    0.0    Died
             M16      69.1    19.0    0.0     4.8     7      Died


^a The percentage of radio fixes of the subadults during their first year of independence falling into
the different lynx habitat probability categories is shown for each response variable set.
^b Probability class: 1: 0-0.2; 2: 0.2-0.4; 3: 0.4-0.6; 4: 0.6-0.8; 5: 0.8-1.


polynomial; 20 percent of the deviance explained),
slope (20.5 percent), and forest (3.4 percent). The
same predictors were retained when presence/absence
data from female lynx were used to build the model
(elevation second-order polynomial 16.3 percent,
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slope 19.9 percent, and forest 3.4 percent, respec- 
tively). Four predictors were selected when we used 
presence/absence data from males. Here, elevation ex- 
plained 17.4 percent, slope 20.2 percent, forest 4.2 
percent, and roads 1.4 percent of the deviance. 

The comparison of the resulting probabilities 
showed a high conformity between the three distribu- 
tion maps. The subtractions of the probability values of 
most grid cells gave results close to zero. The differences 
between the probabilities of the 1,085 grid cells from 
the female versus the male distribution ranged from 
-0.34 to 0.33. More than 90 percent (978 cells), however,
had values between -0.2 and 0.2. When subtracting
the female lynx distribution probability map from
the total map, all grid cells had a positive value less
than 0.12. The differences for the male versus total map
comparison ranged from -0.31 to 0.45, with 68 percent
(735 cells) falling into the class from -0.2 to 0.2.

We then extrapolated the outcome of the three dis- 
tribution probabilities over the entire Swiss Jura 
Mountains (see Fig. 58.2 in color section). The map of 
the potential distribution for males is the most restrictive,
whereas the maps for females and for both
sexes combined show
larger areas in the higher probability classes.

Maps of potential lynx distribution were based on 
the telemetry locations of resident lynx. As a supple- 
mentary evaluation of the models, we investigated the 
survival of young, dispersing lynx according to their 
habitat use. The lynx is a solitary, territorial species, 
and subadult lynx have to leave the parental home 
range at the age of about ten months (Breitenmoser et 
al. 1993). One can predict that subadult lynx can only 
establish a permanent home range if they find free 
space; otherwise, they would be driven into sub- 
optimal habitat. Each subadult lynx revealed an indi- 
vidual fate, although the tendency observed was con- 
sistent with our habitat model: two subadult lynx 
(M14, F12 in Fig. 58.2) dispersed north from our 
study area into the still-unoccupied part of the Jura 
Mountains (Breitenmoser and Baettig 1992; Capt et 
al. 1998). They traveled along the corridors predicted 
from the habitat model (Fig. 58.2), and both settled in 
good-quality habitat (Table 58.3). F12 was poached 
one year after independence. Two other young lynx 
(F23, F26) were able to establish home ranges in
good-quality habitat inside the study area (Table
58.3). Both had taken over the home ranges of their 
mothers after the deaths of the latter. The subadult fe- 
male (F17) was killed by a car during her dispersal. Fi- 
nally, the locations of F13 and M16 showed a high 
share of suboptimal habitat (Table 58.3). Both lynx 
died of natural causes during dispersal, F13
after she had left a temporary home range in marginal
habitat (Fig. 58.2).


Discussion 


Our models do not identify single variables but rather 
the combination of variables limiting lynx distribu- 
tion. Different combinations of variables can result in 
the same probability of presence. Slope and elevation 
were the most powerful variables predicting lynx 
presence/absence in the three models. This result is
less a trait of the lynx, which over a large part of its
range lives in lowland forests, than of
our study area, where forested areas are correlated
with elevation and slope as a result of human activities.
This observation underlines the local nature of
our models and shows that the selected variables do 
not necessarily have a biological value for the species 
in question, as discussed by Guisan and Zimmermann 
(2000). Consequently, such models should only be ap- 
plied to regions similar to those where the basic data 
were originally gathered. The human impact on carni- 
vores is extremely difficult to evaluate, although today 
this is the main factor limiting their distribution (Boitani
and Ciucci 1993; Mladenoff et al. 1995; Corsi et
al. 1998). It is not a simple variable, nor can its distribution
be easily mapped. In our model, we suppose
that the human impact is included in other variables 
such as road density, human population density, or 
land use. Even in areas of generally good habitat, 
roads, which have a limited spatial extension and 
seem not to reduce the habitat quality considerably, 
can be a risk factor, as demonstrated by the fate of 
F17 (Fig. 58.2). Failure to incorporate such spot-like 
or linear, but critical, habitat features or ecological 
factors such as prey availability, competition, preda- 
tion (Pearce et al., Chapter 32) and the like can lead to 
prediction errors. Data on number and distribution of 
roe deer and chamois, the main prey of lynx in the 
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study area (Jobin et al. 2000), are presently not avail- 
able in adequate form or precision to be incorporated 
into a habitat model. However, as ungulate distribu- 
tion is habitat dependent, too, we can assume that the 
presence/absence approach of lynx at least partly re- 
flects prey availability. 

Most classifiers assume that class membership is 
known without errors (Fielding, Chapter 21). Lynx 
were not located in all favorable lynx zones within the 
study area, because peripheral spots of good habitat (1) 
might not be connected to the lynx zone, (2) might be 
occupied by neighboring lynx, or (3) surveillance den- 
sity might have been insufficient. It is a shortcoming of 
our method that the defined categories (presence/ 
absence) are not exclusive. Assuming absence of a 
species where it was actually present is a type II error 
that could be corrected with adequate sample size and 
monitoring duration to increase the power of the statis- 
tical evaluation (Morrison et al. 1998). To minimize 
this error, we restricted our sample area to the southern 
portion of the Swiss Jura Mountains, where lynx were followed
most intensively by telemetry.

The model provides a tool for the conservation and 
the management of the lynx in the Jura Mountains. 
An early study by Breitenmoser and Baettig (1992), 
based on random observations of lynx gathered from 
1972 to 1987, revealed the discontinuous distribution 
of the lynx in the Jura Mountains and a lack of obser- 
vations in the central part of the range (Figs. 58.1, 
58.2). Our model confirms that the central part, espe- 
cially for males, seems to be a suboptimal habitat (Fig. 
58.2). So far, the AUC values from the evaluation as 
well as the anecdotal observations of dispersing 
subadult lynx seem to confirm the validity of the 
model for the Jura Mountains. None of the subadults 
settled in the central part of the Jura Mountains but 
continued on to adjacent areas (Figs. 58.1, 58.2). All 
subadult lynx dispersed through corridors predicted 
by the model. The final test for our model, however, 
will be the future spread of the lynx population 
through the northern part of the Swiss Jura Moun- 
tains. The model can predict the potential distribution 
of the lynx in the Jura Mountains and, when based on 
knowledge of the land tenure system of the resident 
lynx (Breitenmoser et al. 1993), allows estimation of 
the possible population size. Such knowledge will be
crucial for the conservation and management of this
large carnivore species living in such close proximity 
to intensive human activities. Since large-carnivore 
populations are difficult to census over vast areas, a 
modeling approach based on high-quality, local data 
from telemetry may be more efficient. Decisions will 
have to be made in regard to the choice of the model 
and the threshold value. We prefer the model built 
from presence/absence data of both sexes, because this 
had the best fit (Table 58.2) and best represents the
needs of the population as a whole.

In conservation-oriented models, an excess of
false positive locations (the model predicts
presence of a species when in fact it is absent) and
an excess of false negative locations carry
different conservation risks (see also Fielding, Chapter
21). The balance between false positives and false negatives
is defined through the threshold value and must
be set according to the question to be answered. The
lower the threshold value, the higher the percentage
of all quadrats containing lynx fixes that are included, but also
the higher the share of included quadrats without any
locations.
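
The trade-off can be made concrete with a short sketch that counts, for a given threshold, the share of occupied quadrats correctly included (sensitivity) and the share of unoccupied quadrats wrongly included (false positive proportion). The six quadrats and their predicted probabilities are invented for illustration.

```python
def rates_at_threshold(observed, predicted, threshold):
    """Return (sensitivity, false positive proportion) for one threshold.

    A quadrat is predicted as suitable when its probability is >= threshold.
    """
    true_pos = sum(1 for y, p in zip(observed, predicted) if y == 1 and p >= threshold)
    false_pos = sum(1 for y, p in zip(observed, predicted) if y == 0 and p >= threshold)
    n_presence = sum(observed)
    n_absence = len(observed) - n_presence
    return true_pos / n_presence, false_pos / n_absence


# Hypothetical quadrats: 1 = contains lynx fixes, 0 = no fixes.
observed = [1, 1, 1, 0, 0, 0]
predicted = [0.9, 0.6, 0.3, 0.7, 0.4, 0.1]

strict = rates_at_threshold(observed, predicted, 0.5)
lenient = rates_at_threshold(observed, predicted, 0.2)
print(strict, lenient)
```

Lowering the threshold from 0.5 to 0.2 in this toy example captures every occupied quadrat but doubles the share of unoccupied quadrats that are flagged.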

Another practical use of the model will be the eval- 
uation of potential connections of the Jura population 
with neighboring lynx populations in the Alps or in
the Vosges Mountains. For this purpose, however, we 
will have to expand our model into France and test its 
capability to predict corridors or barriers. 
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PART 5 
Predicting Species: 
Populations and 
Productivity 
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PART 5 


Mapping a Chimera? 


Edward O. Garton 


Can we map abundance and productivity of a population
from ecological-habitat maps of characteristics
such as soil type, canopy cover, land use, and land-type
morphology? Unfortunately, populations of
animals are dynamic, changing, and even chimeric. 
Presence-absence maps oversimplify complex patterns 
of continually expanding and shrinking distributions. 
Perhaps a better approach would be to use our ecologi- 
cal maps as the basis for sampling to estimate popula- 
tion characteristics. The most extensive effort to esti- 
mate abundance and production for any population in 
the world is the continental waterfowl survey con- 
ducted annually by the U.S. Fish and Wildlife Service 
and Canadian Wildlife Service (1987). Biologists from 
federal, state, and provincial resource management 
agencies sample wetlands using aerial and ground 
counts of waterfowl on more than 3.6 million square 
kilometers of breeding habitat in Canada and the 
United States. These sample counts produce annual esti- 
mates of population size and production of various 
species of ducks (Smith 1995). This approach uses 
maps of wetlands to delineate geographic units of potential
habitat and then samples the geographic units
within species distributions. Finite sampling methods 
(Hayek and Buzas 1997) are applied to these samples to 
estimate the characteristics for the entire population. 
Another approach is a hybrid strategy that combines 
current sampling data with previous studies and ecological
maps to model population characteristics. This
modeling approach is the most powerful approach to
predicting population characteristics. A majority of 
chapters in this section take this modeling approach to 
predicting species-population characteristics. 
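
As a simple illustration of the finite sampling idea mentioned above, the sketch below expands counts from a sample of geographic units to a population total, with a standard error that includes the finite population correction. It assumes simple random sampling of equal-sized units, which is a deliberate simplification and not the design of the continental waterfowl survey; the counts and the function name are hypothetical.

```python
import math
import statistics


def estimate_total(sample_counts, n_units_total):
    """Estimate a population total from a simple random sample of units.

    Returns (estimated total, standard error with finite population correction).
    """
    n = len(sample_counts)
    mean_count = statistics.mean(sample_counts)
    sample_variance = statistics.variance(sample_counts)  # n - 1 denominator
    total = n_units_total * mean_count
    fpc = 1.0 - n / n_units_total  # shrinks the error as the sample nears a census
    standard_error = n_units_total * math.sqrt(fpc * sample_variance / n)
    return total, standard_error


# Hypothetical counts of ducks on 5 sampled wetland units out of 50 mapped units.
total, standard_error = estimate_total([12, 8, 15, 5, 10], 50)
print(total, round(standard_error, 2))
```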

Whether we choose to map, sample, or model pop- 
ulation characteristics of species, we must immediately 
confront the issue of variability. Can we treat popula- 
tion characteristics as constant and make deterministic 
predictions, or must we take a probabilistic approach? 
Assuming constancy of populations is conceptually 
simple but unrealistic, and it makes validation with 
new data difficult. A better approach would be to em- 
brace the variability by predicting probabilistically. 
There are two principal types of variability that we 
must address: (1) variation through time within spa- 
tial units, and (2) spatial variability among units. 


Temporal Variation within Spatial Units 


Populations within individual spatial units can vary 
substantially, both seasonally and yearly, due to ge- 
netic, demographic, and environmental variability 
(Shaffer 1981). Census data for such small popula- 
tions are best described with means and variances of 
the characteristics such as abundance, survival rate, 
annual rate of change, and so forth, or by the trend or 
periodicity for populations showing long-term trends 
or cycles. At larger spatial extents, populations or 
metapopulations typically show less variation through 
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time because genetic, demographic, and environmen- 
tal variation of the subpopulations are uncorrelated or 
weakly correlated with each other over larger areas. 
These independent random deviations tend to cancel 
each other out over larger areas. 
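
The cancellation argument can be made explicit. For k subpopulations with equal means and variances and a common pairwise correlation ρ, the variance of the total is kσ²(1 + (k − 1)ρ), so the coefficient of variation of the total is the single-unit value multiplied by √((1 + (k − 1)ρ)/k). The sketch below evaluates that expression; the equal-variance, equal-correlation setup is an assumption made for illustration.

```python
import math


def cv_of_total(cv_single, k, rho=0.0):
    """Coefficient of variation of the summed abundance of k subpopulations.

    Assumes equal means and variances and a common pairwise correlation rho.
    """
    return cv_single * math.sqrt((1.0 + (k - 1) * rho) / k)


# Twenty-five subpopulations, each fluctuating with a CV of 0.5.
independent = cv_of_total(0.5, 25, rho=0.0)   # deviations cancel
synchronized = cv_of_total(0.5, 25, rho=1.0)  # no cancellation at all
print(independent, synchronized)
```

With uncorrelated subpopulations the relative variability of the total falls with the square root of the number of units, whereas fully synchronized subpopulations gain nothing from aggregation.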

Variation through time threatens the long-term per- 
sistence of small populations, leading to a major focus 
on population viability analysis (PVA; Soulé 1987a,b). 
Roloff and Haufler (Chapter 60) propose a habitat- 
based PVA that hierarchically structures population 
modeling from individual to subpopulation to popula- 
tion level. They use this habitat-based PVA to predict 
population numbers, distributions, and probability of 
persistence for white-headed woodpeckers (Picoides 
albolarvatus) in an area subject to silvicultural treat- 
ments. In a similar way, Raphael and Holthausen 
(Chapter 62) model the effects of habitat management 
using a spatially explicit population model for the 
northern spotted owl (Strix occidentalis caurina). 
They used locally collected demographic information 
to model the results of proposed management choices, 
and, in the process, found several approaches superior 
to those planned originally by managers. 


Variability between Spatial Units 


Some of the differences between habitat patches (or 
larger-scale spatial units) in natural systems are obvi- 
ous, such as size, shape, resource productivity, amount 
of edge, and distance to adjacent habitat patches. 
Other differences are more subtle, such as influence of 
adjacent matrices, abundance of predators and com- 
petitors, and connectivity to nearby source habitats. 
The only realistic approach possible is to estimate, 
through sampling or modeling, the mean and variance 
of characteristics of populations occupying particular 
spatial units. Sisk, Noon, and Hampton (Chapter 63) 
demonstrated the power of an effective area model 
that predicted spatially explicit animal density as a 
function of maps of habitat type, adjacent habitat, and 
distance from edge. This was done in a probabilistic 
manner, generating maps of animal abundance with 
an associated measure of certainty. Hunsaker et al.
(Chapter 61) used the variability between populations
of California spotted owls to identify habitat
characteristics that were statistically different for successful,
productive pairs.


Hierarchy of Spatial Units 


The classic definition of a population as "a group of
organisms of the same species occupying a particular 
space at a particular time" (Krebs 1994:151) is so gen- 
eral that it could describe everything from a single 
deme occupying a single patch of habitat to a 
metapopulation distributed across great regions containing
enormous areas of unoccupied nonhabitat in addition
to areas of high-quality habitat occupied by
dense concentrations of highly productive individuals. 
A better approach would be to delineate a series of hi- 
erarchical spatial units containing groupings of indi- 
viduals significant for our understanding, estimation, 
and prediction of future population conditions. I sug- 
gest delineating five levels of spatial aggregation on 
the basis of demography, movement, geography, and 
genetics (see Fig. I5.1 in color section) as follows: 


1. Deme: The smallest grouping of individuals show- 
ing random breeding (within the constraints of the 
species’ social system) where it is reasonable to es- 
timate birth, death, immigration, and emigration 
rates. Animals in this grouping are ideally distrib- 
uted continuously in one patch of habitat, and 
their movements within this patch of habitat are 
restricted to home ranges for breeders during the 
breeding season. The size of this patch ideally 
would be related to the dispersal distance of juve- 
niles or perhaps equal an area twenty to fifty times 
the size of a home range. For example, red-winged 
blackbirds (Agelaius phoeniceus) occupy territories 
variable in size but averaging 0.05 hectare for 
males in marsh habitat (n = 868 territories) and 0.3 
hectare for males in upland habitat (n = 97 territo- 
ries, Beletsky 1996:182), with each male territory 
holder guarding a harem averaging 3.3 females 
(n = 2389, Beletsky 1996:136). Males disperse an 
average distance of 1.4 kilometers from their natal 
nest to their first breeding territory and females disperse
1 kilometer on core marshes (Beletsky
1996:28). This suggests that demes of this species 
may cover only 3-5 square kilometers in areas such 
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as Columbia National Wildlife Refuge. This size 
area constituted Beletsky and Orians’ (1996) core 
study area, and supported seventy to eighty male 
redwing territories each year (Beletsky 1996:13) 
and three times that many breeding females. 

2. Population: A collection of demes with strong connections
demographically (very high correlations),
genetically, and through frequent dispersal. The 
population occupies a collection of patches of 
habitat, without large areas (relative to dispersal 
distance) of nonhabitat intervening. The area is 
typically less than one hundred times the size of an 
average home range and not larger than the disper- 
sal distance of 95 percent of initial dispersers, but 
it may be much larger if the habitat patches are lin- 
ear in shape or widely dispersed. For example, the 
maximum known dispersal distance for red-winged 
blackbirds from Beletsky and Orians' (1996) study 
at Columbia National Wildlife Refuge was 7 kilo- 
meters for males and 2.8 kilometers for females 
(Beletsky 1996:13). Delineating a redwing popula- 
tion to include all the patches of marsh habitat 
within 5 kilometers of their 3-square-kilometer 
study area would identify a population of approxi- 
mately seven hundred male territory holders (Belet- 
sky and Orians 1996:152) and 2,700 accompany- 
ing females occupying an area of patchy marsh 
habitat bordering lakes and streams distributed 
throughout a rolling desert grass/shrubland of 150 
square kilometers (Fig. I5.1). 

3. Metapopulation: A collection of populations sufficiently
close together that dispersing individuals
from source populations readily colonize empty 
habitat resulting from local population extinction. 
Populations within a single metapopulation may or 
may not show correlations in demographic rates 
but the low rates of dispersal are sufficient to main- 
tain substantial genetic similarity. For example, 
red-winged blackbird populations distributed 
among the seven national wildlife refuges along 
200 kilometers of the Columbia River in the south- 
central part of the state of Washington may consti- 
tute a metapopulation (Fig. I5.1). 

4. Subspecies: A collection of metapopulations in a
geographic region where very rare dispersals maintain
genetic similarity but populations and metapopulations
occupy habitat patches that may be
separated by enormous distances or by large areas
of nonhabitat, resulting in substantial demographic
independence among metapopulations. The
metapopulations of red-winged blackbirds occupying
the intermountain region east of the Sierra
Nevada and Cascade Mountains and west of the 
Rocky Mountains in southern British Columbia, 
eastern Washington, eastern Oregon, eastern Cali- 
fornia, and in Idaho and Nevada are together cate- 
gorized as the subspecies A. phoeniceus nevadensis 
(Beletsky 1996:21). 

5. Species: The collection of subspecies encompassing 
the entire distribution and geographic range of the 
species. The species may encompass substantial dif- 
ferences in phenotypes (habitat, physiology, behav- 
ior) and genotypes. For example, red-winged 
blackbirds breed extensively across North and 
Central America from east-central Alaska and the 
Yukon to Costa Rica and Cuba (Fig. I5.1). The
twenty-two recognized subspecies vary in size, 
shape, and plumage, yet numerous genetic studies 
have repeatedly observed a remarkably high degree 
of genetic similarity and a lack of genetic differen- 
tiation among subspecies (Beletsky 1996:22). Dol- 
beer (1982) documented numerous movements be- 
tween winter roosts by banded redwings of 
hundreds or thousands of kilometers, suggesting 
that even a small amount of straying from breeding 
areas due to males or females following flock mem- 
bers from winter roosts would lead to frequent ge- 
netic exchange among populations spread widely 
throughout the species distribution. 


Theobald and Hobbs (Chapter 59) outlined an approach
to habitat delineation, based on habitat quality, that
incorporates functional relationships of species to
resources, environmental factors (e.g., elevation, aspect),
and disturbance. They propose delineation at three of the
scales identified above based on allometric relationships:
individual scale based on foraging behavior, population
scale based on minimum viable populations, and
metapopulation scale dependent on dispersal. They argue
that if such an approach to delineation of spatial
aggregates proves successful for specific animal species,
it would provide a very useful tool for identifying
appropriate patches for purposes of prediction and
estimation.

Delineation of spatial units appropriate for each of
the five levels in this hierarchy would lead to dramatic
improvements in our understanding of the processes
driving population changes and in our ability to predict
the consequences of management activities. Efforts to
increase our understanding of population processes beyond
those operating primarily on single isolated populations
in independent patches of habitat depend upon successfully
viewing populations from this hierarchical perspective.
Without this perspective, we may fail to understand the
causes and consequences of interactions between
populations and incorporate them into our thinking.
Further advances in understanding the processes
determining population change depend critically upon this
next step. Otherwise, we may simply be trying to map a
chimera.


CHAPTER 


59 


Functional Definition of Landscape Structure 
Using a Gradient-based Approach 


David M. Theobald and N. Thompson Hobbs 


The typical approach to conceptualizing landscape
patterns requires distinguishing between patches 
and the matrix surrounding them (Dramstad et al. 
1996; Forman and Godron 1986; Fig. 59.1). This dis- 
tinction is rooted in island biogeography theory 
(MacArthur and Wilson 1967), where patches are 
metaphors for islands and the matrix is an inhos- 
pitable “sea” (Wiens 1995; Wiens 1996b). The ap- 
proach is straightforward and accessible because it 
usually corresponds to the scale of human perception 
of the landscape and resembles traditional carto- 
graphic representations of landscapes using categori- 
cal maps (Gustafson 1998). Furthermore, analysis of 
these landscape maps can be accomplished easily 
through standard geographic information systems 
(GIS) methods (e.g., Haines-Young et al. 1993). As a 
result, the patch/matrix (hereafter PM) approach has 
been a dominant approach to quantifying landscape 
structure to date. 

However, we argue here that the PM approach is
limited because it cannot easily incorporate functional
characteristics of species or ecological processes in
representations of landscape structure. In particular,
biological mechanisms that operate on a landscape are
often poorly represented by PM models. An important
challenge for the field of landscape ecology is to move
beyond simple representations of pattern that fail to
reveal the consequences of pattern for ecological
processes (Wiens 1999).

For example, the basis for most studies in landscape 
ecology is a land-cover or vegetation map. Patches of 
vegetation are commonly represented by polygons in a 
categorical map. Such maps emerge from photointer- 
pretation of aerial photography or aggregation of adja- 
cent (either four or eight) cells from a classified re- 
motely sensed image, and often patches are not defined 
consistently or are not biologically based (Paton 1994). 
Representing a patch as a discrete entity (e.g., a poly- 
gon) ignores both the fuzziness of the patch boundary 
and heterogeneity within the patch (Gustafson 1998). 
Noncontiguous patches of vegetation can be function- 
ally integrated if a species or process of interest oper- 
ates at a scale that can span patches (With and Crist 
1995; Hobbs 1999). Uncertainties associated with veg- 
etation data are seldom reflected in habitat maps mod- 
eled from vegetation and other elements (Flather et al. 
1997), even though all maps have some level of inaccu- 
racy (Goodchild and Gopal 1989) and there are well- 
established methods of error assessment in land-cover 
mapping (e.g., Congalton 1991). 

An important recent finding demonstrates that models
that use the core area of a patch, rather than the entire
patch, are better predictors of species presence/absence
for species that avoid patch edges and utilize patch
interiors (Temple 1986a). The reduction in habitat quality
at patch edges is caused by processes such as changes in
microclimate regimes (Chen et al. 1995; Stevens and Husband
1998) and increased predation rates (Paton 1994). Although
the depth of this edge effect is typically modeled using a
uniform distance (e.g., Reed et al. 1996), processes that
create edge are known to vary considerably, for example by
aspect and edge contrast (Chen et al. 1995). Methods have
been developed to vary the edge distance (e.g., Baskent and
Jordan 1995), but these techniques are rarely used in
landscape analysis. Moreover, edges are typically modeled
as a step function, where all habitat value within the
edge-effect distance is lost, despite the knowledge that
for most processes the magnitude of the edge effect
decreases with increasing distance from the edge (e.g.,
Chen et al. 1992; Chen et al. 1995; Paton 1994).


Figure 59.1. The standard approach to conceptualizing
landscape structure distinguishes among patches, core
areas, edge areas, and the matrix.

Patches are typically considered to be imbedded in 
a matrix of an inhospitable intervening landscape, and 
this matrix is generally considered biologically inert or 
even hostile. As a result, isolation of patches is
thought of simply in terms of Euclidean distance be- 
tween patches (e.g., Schumaker 1996; Keitt et al. 
1997). However, the matrix is not ecologically homo- 
geneous and movement through the landscape is af- 
fected, not only by the probability of encountering 
patch edges but also by how species perceive and re- 
spond to heterogeneity within the matrix (Wiens et al. 
1993). Matrix properties such as edge contrast, vege- 
tation structure, and land use clearly influence species 
movement (Fagan et al. 1999). 

Although the limitations of the PM approach are
generally recognized, they are often overlooked because
there are no widely available alternatives for modeling
landscape structure. We assert that it will remain
difficult to demonstrate linkages between landscape
structure and ecological processes until alternative
approaches are developed that explicitly include
understanding of process in the representation of
habitat pattern.

In this chapter, we describe an approach that devel- 
ops two innovations beyond the typical PM approach. 
First, although the typical approach maps a species' 
habitat, or the physical space within which a species 
lives, we recognize that landscapes include a full range 
of habitat quality that better describes the ability of an 
area to provide the appropriate conditions. Second, 
we define patches on a functional basis by identifying 
components of landscape structure by scaling to eco- 
logical distances. We define patches at three levels of 
organization: individual, population, and metapopula- 
tion. Each of these levels is defined by a corresponding 
process: daily foraging, movement within home 
ranges, and dispersal between habitat. We represent 
these processes with simple allometric models that 
scale model parameters to body mass. We then use 
these models to develop functional representations of 
landscape structure. After developing this algorithm, 
we illustrate the differences between this gradient- 
based approach and the typical PM approach to land- 
scape characterization. A broader framework that ex- 
plicitly incorporates biological knowledge is needed to 
better understand the consequences of changes in 
landscape structure on species distributions and to un- 
derstand the consequences of land-use change on pop- 
ulation viability and distribution. 


Methodology 


Our primary goal is to develop an approach to defin- 
ing landscape structure in functional terms. To do so, 
we first develop a model that quantifies habitat qual- 
ity. We then identify patches of habitat based on scal- 
ing of the behavior of a species at three levels of 
organization: individual, population, and metapopula- 
tion. A major challenge in modeling landscape struc- 
ture is to better incorporate biological mechanisms, 
yet not exceed our ability to parameterize a model nor 
overwhelm the interpreter with overly complex modeling
results. We chose to use allometric relationships
to illustrate our functional approach (Peters 1987), 
because these relationships provide a useful way to pa- 
rameterize models for a broad range of species with a 
minimum of detail. Because differences in life history 
characteristics of species are important, the allometric 
relationships should only be used in a general way and 
with an understanding of the error terms of the allo- 
metric relationship. Modeling habitat for a specific 
species requires detailed understanding of the species 
and careful parameterization of the model. 


Habitat Quality 


First, we identify areas that provide resources for a given
species. Ideally, resources are defined in terms of the 
individual elements (e.g., flora, fauna, soil, water, etc.) 
used by a species at a given location (Morrison and 
Hall, Chapter 2). However, resources are most com- 
monly defined in terms of overstory vegetation or 
land-cover types. Typically, this is done through a 
species-vegetation affinity table that contains the vege- 
tation types utilized by a species (e.g., Caicco et al. 
1995; Edwards et al. 1996; White et al. 1997a). We 
extend this approach by allowing the affinity value to 
range from 0.0 to 1.0, describing the strength of the 
association. If an error matrix has been built for the 
vegetation image, uncertainty can be incorporated 
into the habitat-quality model by multiplying the 
affinity score by the probability that a location 
mapped as a particular type is correct. Fuzzy logic can 
be used to further incorporate fine-scale information 
into distribution maps by computing the degree of 
membership in each vegetation type for each location 
(pixel) on a map (e.g., Hill and Binford, Chapter 7). 
Other factors such as stand structure, canopy cover, 
and soil type can be incorporated to adjust estimates 
of resource availability. Although there are practical 
limitations to resource data, landscape analysts should 
strive to narrow the gap between the ideal definition 
of resources and the limitations of available data.

Second, we identify environmental gradients that
modify a species' ability to utilize resources, such as
elevation, slope/aspect, precipitation, or solar exposure.
These environmental gradients are used to adjust the
predicted habitat for a species. Typically, environmental
gradients are represented by a binary map that describes
the limits of a range, but we extend the values to
range from 0.0 to 1.0. For example, species distributions
are usually limited by upper and lower elevation
ranges, yet these limits create artificial boundaries.

Third, the quality of habitat resources for certain 
species can be modified by either in situ or nearby nat- 
ural or human-caused disturbance. For example, some 
species have lower population densities in areas adja- 
cent to land-cover conversions such as forest clearcuts 
or urban areas (Harrison 1997). Typically, habitat loss 
caused by in situ impacts on habitat is represented by 
a change in land-cover type (e.g., forested to urban 
land conversion) at a particular location, whereas 
habitat loss from nearby land-cover types is modeled 
by removing the patch edge, leaving the patch core 
(Temple 1986a). However, we represent both in situ 
and nearby modifications to habitat quality by devel- 
oping a relationship between the disturbance (e.g., 
road or housing density) and the impact on habitat 
use (e.g., Theobald et al. 1997). This allows us to 
model the reduction of habitat quality caused by in 
situ impacts that may not be captured by typical land- 
cover maps. For example, habitat quality can be re- 
duced by human activities in suburban and rural areas 
(Harrison 1997), yet these locations are not repre- 
sented by urban land-cover types. Typically, edge ef- 
fects reduce patch quality near the edge, but we ex- 
tend our approach to allow both lowered and 
increased habitat quality as a function of distance to 
edge. For interior species, habitat quality near the 
patch edge can be lowered, but for edge species, patch 
core areas may be poorer habitat than at the edge. 

Thus, we define Q as an index of habitat quality,
which is based on surrogate measures of habitat quality.
Q is measured in terms of area (e.g., hectares) and
is a function of three components (see Fig. 59.2 in
color section):


$$Q = f(R, S, D) \qquad (59.1)$$


where resources, environmental factors, and distur- 
bances are denoted by R, S, and D, respectively. 
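
One way to make Equation 59.1 concrete for a single map cell is sketched below. The chapter does not give the functional form of f, so the multiplicative combination, the trapezoidal elevation response, the exponential road-density penalty, and every function name and number here are illustrative assumptions rather than the authors' model.

```python
import math

def resource_score(affinity, p_correct=1.0):
    """R: species-vegetation affinity (0-1) weighted by the probability
    that the mapped vegetation type is correct (from an error matrix)."""
    return affinity * p_correct

def gradient_score(value, lower, lower_opt, upper_opt, upper):
    """S: trapezoidal 0-1 response to an environmental gradient
    (assumed shape; it replaces a binary inside/outside range map)."""
    if value <= lower or value >= upper:
        return 0.0
    if value < lower_opt:
        return (value - lower) / (lower_opt - lower)
    if value > upper_opt:
        return (upper - value) / (upper - upper_opt)
    return 1.0

def disturbance_score(road_density, k=0.5):
    """D: 0-1 multiplier that declines with disturbance intensity
    (assumed exponential form; k is a hypothetical rate)."""
    return math.exp(-k * road_density)

def habitat_quality(r, s, d):
    """Q = f(R, S, D); a simple product is assumed here."""
    return r * s * d

# Hypothetical cell: strong affinity, 90% map accuracy, mid-range
# elevation (m), no roads nearby.
q = habitat_quality(
    resource_score(affinity=0.8, p_correct=0.9),
    gradient_score(2600.0, lower=2000.0, lower_opt=2400.0,
                   upper_opt=3000.0, upper=3400.0),
    disturbance_score(road_density=0.0),
)
print(round(q, 2))
```

A product keeps Q between 0 and 1 and drives it to zero whenever any one component is zero; other forms, such as a minimum or a weighted mean, would be equally easy to substitute.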


Individual 


Within areas smaller than a home range, movement is 
characterized by foraging, when species are maximizing 
their energetic return through foraging behavior
(Hobbs 1999). Species integrate resources over a local 
area through foraging and so definition of patches 
should reflect this scale of behavior (Addicott et al. 
1987). 

We define patches at the individual level based on
home-range requirements. The area of home range (I)
is based on an allometric relationship to body size (M)
(Eq. 59.2) (Harestad and Bunnell 1979):


$$I = 1.166\,M^{1.06} \quad \text{where } S_{y \cdot x} = 1.12 \qquad (59.2)$$


where M is body mass in kilograms and I is in square
kilometers. Next, we derive the radius required ($R_i$) to
fulfill the area requirement using Equation 59.3. For
every cell in the habitat map, we calculate the average
index of habitat quality within the radius $R_i$ using
Equation 59.4 (see Fig. 59.3 in color section).


$$R_i = (I / 3.1415)^{1/2} \qquad (59.3)$$


$$P_i = \mathrm{focalmean}(Q, \mathrm{circle}, R_i) > t \qquad (59.4)$$
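
The home-range calculation can be checked numerically with the short sketch below. It takes the coefficient 1.166 and exponent 1.06 for Equation 59.2, values that reproduce the individual-patch areas and radii listed in Table 59.1; the function names are hypothetical.

```python
import math

def home_range_area_km2(mass_kg):
    """Eq. 59.2: I = 1.166 * M**1.06 (M in kg, I in km^2)."""
    return 1.166 * mass_kg ** 1.06

def home_range_radius_m(mass_kg):
    """Eq. 59.3: radius of a circle of area I, converted to meters."""
    return math.sqrt(home_range_area_km2(mass_kg) / math.pi) * 1000.0

# Body sizes used in Table 59.1.
for mass in (1, 10, 50):
    print(mass,
          round(home_range_area_km2(mass), 2),
          round(home_range_radius_m(mass)))
```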


The resulting map depicts a gradient of values, or
the proportion of a cell that contributes to habitat. To
identify contiguous areas of habitat adequate for an
individual, we threshold the gradient values, retaining
cells where the average quality $P_i$ exceeds a
user-specified efficiency threshold (t). Ultimately, the
appropriate threshold value depends on the behavior of the
species or process in consideration. We then aggregate
adjacent (eight-neighborhood) cells to form patches that
contribute to an individual's habitat.
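
The three steps described for the individual level (focal mean within a circular window, thresholding, and eight-neighbor aggregation) can be sketched in plain Python as follows. This is a stand-in for a GIS focal-statistics operation, not the authors' implementation; the tiny grid, the radius in cells, the threshold, and the handling of map edges are all assumptions made for illustration.

```python
def focal_mean(grid, radius_cells):
    """Mean of grid values within a circular window around each cell.
    Window cells that fall outside the map are ignored (assumed)."""
    rows, cols = len(grid), len(grid[0])
    r = int(radius_cells)
    offsets = [(dr, dc)
               for dr in range(-r, r + 1) for dc in range(-r, r + 1)
               if dr * dr + dc * dc <= radius_cells ** 2]
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            vals = [grid[i + dr][j + dc] for dr, dc in offsets
                    if 0 <= i + dr < rows and 0 <= j + dc < cols]
            out[i][j] = sum(vals) / len(vals)
    return out

def label_patches(mask):
    """Group True cells into patches using eight-neighbor adjacency."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    n = 0
    for i in range(rows):
        for j in range(cols):
            if mask[i][j] and labels[i][j] == 0:
                n += 1
                labels[i][j] = n
                stack = [(i, j)]
                while stack:
                    a, b = stack.pop()
                    for da in (-1, 0, 1):
                        for db in (-1, 0, 1):
                            y, x = a + da, b + db
                            if (0 <= y < rows and 0 <= x < cols
                                    and mask[y][x] and labels[y][x] == 0):
                                labels[y][x] = n
                                stack.append((y, x))
    return labels, n

# Hypothetical habitat-quality map with two blocks of good habitat.
Q = [
    [1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0],
]
t = 0.5                                    # efficiency threshold
P = focal_mean(Q, radius_cells=1)          # Eq. 59.4, before thresholding
mask = [[p > t for p in row] for row in P]
labels, n_patches = label_patches(mask)
print(n_patches)
```

On a real raster the radius in cells would be the home-range radius divided by the cell size, and a raster library would replace the nested loops.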


Population 


Population patches are identified in a fashion similar
to that used at the individual level, but we identify the
area required to support minimum viable population patches
based on allometric scaling from minimal mammal
densities (Silva and Downing 1994) (Eqs. 59.5 and
59.6):


$$D = e^{(-0.68 \ln(M) + 2.1414)} \qquad (59.5)$$


$$P = A / D \qquad (59.6)$$


where D is the minimum viable density in animals per
square kilometer and P is the area required to support
A animals. The population patch size based on minimal
mammal densities is sensitive to uncertainty and
variability in population densities (Bowers and Matter
1997; Van Horne et al. 1997).
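
A short numerical check of the population-level scaling is sketched below. It reads Equation 59.6 as area = animals / density, which follows from the definitions of D, P, and A given above. The text does not state a value for A; A = 100 is an inference, chosen because it reproduces the population-patch areas in Table 59.1. Function names are hypothetical.

```python
import math

def min_density_per_km2(mass_kg):
    """Eq. 59.5: D = exp(-0.68 * ln(M) + 2.1414), animals per km^2."""
    return math.exp(-0.68 * math.log(mass_kg) + 2.1414)

def population_patch_area_km2(mass_kg, n_animals=100):
    """Eq. 59.6 read as P = A / D: area needed to hold A animals at
    density D. The default A = 100 is inferred, not stated in the text."""
    return n_animals / min_density_per_km2(mass_kg)

# Body sizes used in Table 59.1.
for mass in (1, 10, 50):
    area = population_patch_area_km2(mass)
    radius_m = math.sqrt(area / math.pi) * 1000.0
    print(mass, round(area, 2), round(radius_m))
```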


Metapopulation 


A third level of organization is the metapopulation 
level. At this spatial extent, species response to land- 
scape structure is characterized by dispersal. Here, the 
goal is to identify clusters of population patches that 
are within the dispersal distance of one another. Con- 
versely, population patches are considered isolated 
when they are beyond the dispersal distance. The typ- 
ical approach is to consider patch isolation in terms of 
the Euclidean distance from one patch to another 
(e.g., Keitt et al. 1997). However, dispersal is affected 
not only by the location of habitat patches, but also 
by the characteristics of the intervening patches that 
make up the matrix. 

The relative ease or difficulty a species has in mov- 
ing through the matrix is largely influenced by the 
land use/cover type at any given location. We concep- 
tualize the matrix in terms of impedance to movement 
and represent it as a cost surface. This allows Euclid- 
ean distance to be modified by the relative ease or dif- 
ficulty to travel from one location to another. Also, 
human land-use and natural-disturbance regimes in 
the matrix interact with land use/cover and can further
restrict dispersal. The relative resistance to movement
needs to be specified for each land use/cover type.

Again, we use allometric scaling to parameterize 
dispersal ability based on body size: 


$$\beta = 0.001\,M^{-0.91} \qquad (59.7)$$


$$L = -\ln Z / \beta \qquad (59.8)$$


where β describes the probability of successful dispersal
(D. Malkinson and N. T. Hobbs unpublished data), and L
is the dispersal distance (m) for a probability of Z. We
then calculate the distance based on matrix quality
using a cost-distance function, where the population
patches are the source patches for the cost-distance
function. Typically, we reclassify land use/cover maps
to reflect how the cover types would impede movement
through that area (see Fig. 59.4 in color section).
The impedance values could also reflect other data
layers, such as road or housing density, and very high
values could cause a “barrier” to movement. Population
patches are then grouped if they are within the
distance L.


TABLE 59.1.

Parameters for a range of body sizes based on allometric relationships for mammals.

            Individual patch            Population patch            Metapopulation dispersal
Body        Area     Radius   No.       Area     Radius   No.       Distance (m)   No.
size (kg)   (km²)    (m)      patches   (km²)    (m)      patches   at p = 0.5     patches
 1           1.17      609    481        11.75   1,934    119          693         113
10          13.39    2,064     89        56.23   4,231     33        5,634          16
50          73.72    4,844     31       168.00   7,313     27       24,372           5
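
The metapopulation step can be sketched in two parts: the allometric dispersal distance and a cost-distance surface over an impedance map. The coefficients 0.001 and -0.91 are read from Equations 59.7 and 59.8 and reproduce the 693 m and 24,372 m distances in Table 59.1. The cost-distance routine is a generic Dijkstra search on an eight-neighbor grid, offered as an assumption about how such a function might behave rather than as the GIS routine the authors used; the impedance values and cell size are hypothetical.

```python
import heapq
import math

def dispersal_distance_m(mass_kg, z=0.5):
    """Eqs. 59.7-59.8: beta = 0.001 * M**-0.91 and L = -ln(Z) / beta."""
    beta = 0.001 * mass_kg ** -0.91
    return -math.log(z) / beta

def cost_distance(cost, sources, cell_size_m):
    """Least accumulated cost from any source cell to every cell.
    A step costs the mean impedance of the two cells times the
    distance moved (diagonal steps are longer by sqrt(2))."""
    rows, cols = len(cost), len(cost[0])
    dist = [[math.inf] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        dist[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                y, x = r + dr, c + dc
                if (dr or dc) and 0 <= y < rows and 0 <= x < cols:
                    step = cell_size_m * math.hypot(dr, dc)
                    nd = d + step * (cost[r][c] + cost[y][x]) / 2.0
                    if nd < dist[y][x]:
                        dist[y][x] = nd
                        heapq.heappush(heap, (nd, y, x))
    return dist

# Hypothetical impedance map: 1 = easy to cross, 5 = strongly impeding.
impedance = [
    [1, 1, 1, 1],
    [1, 5, 5, 1],
    [1, 1, 1, 1],
]
L = dispersal_distance_m(10)   # dispersal distance for a 10-kg mammal
d = cost_distance(impedance, sources=[(1, 0)], cell_size_m=1000.0)
connected = d[1][3] <= L       # is the far patch within reach?
print(round(L), round(d[1][3]), connected)
```

Because the least-cost route detours around the high-impedance cells, the effective distance between the two patches exceeds the 3,000 m straight-line distance, which is the behavior the cost-surface approach is meant to capture.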

We illustrate our methodology by examining individ- 
ual, population, and metapopulation patches defined 
for a representative range of mammals in Colorado 
(Table 59.1). We use the vegetation map from the Col- 
orado Gap Analysis Project, which was produced by in- 
terpretation (using a 100-hectare minimum mapping 
unit) of a Landsat Thematic Mapper image (30-meter 
resolution). We identified forested-type habitats (includ- 
ing coniferous, deciduous, and mixed). We then found 
the individual and population patches for mammals of 
1, 10, and 50 kilograms in size and reclassified the land- 
use/cover map to reflect matrix quality and its influence 
on the interpopulation-patch movement. 


Results 


Over 568,600 square kilometers of habitat are defined 
using the typical binary approach, and there are over 
5,900 patches. As is typical of habitat maps produced 
using the binary approach, the distribution of patch 
sizes is highly skewed. For instance, over 45 percent of 
the patches are two cells or less in size (8 hectares), 
and 67 percent of patches are less than ten cells in 
size. The average patch size is 45 square kilometers. 
In contrast, the habitat maps produced using the 
functional algorithm have fewer patches (Table 59.1), 
and patches are more evenly distributed by size. The 
smallest average patch size is 166 square kilometers 
for individual population patches for 1-kilogram body 
size. The number of patches decreases rapidly
(nonlinearly) with body size.


Conclusions


Assessments that examine the consequences of devel- 
opment and land-use change for habitat quality and 
fragmentation should be based on analyses that ex- 
plicitly incorporate the functional response of a 
species or process. Without incorporating these re- 
sponses, evaluations of landscape change quantify 
structural changes that are simply an artifact of the 
data and interpretation scale. This requires biological 
parameters to be incorporated, and we find that para- 
meterization of these models using allometric scaling 
of body size for three levels of organization is a useful 
way to incorporate biological realism into modeling 
habitat fragmentation. 

We do not contend that our approach is necessarily 
more accurate than other modeling approaches. In- 
deed, testing predicted versus observed patch occu- 
pancy is fraught with its own challenges (Fielding, 
Chapter 21; Karl et al., Chapter 51), including the
difficulty of understanding errors of commission (i.e.,
interpreting predictions of occupied habitat versus the
lack of field data that demonstrate unoccupied habitat).
However, we do believe that the approach we
offer is useful because it provides results that are more 
easily interpreted than results from traditional meth- 
ods. That is, the linkage between the assumptions 
made about the mechanisms affecting habitat use and 
the resulting landscape pattern is explicit, ecologically 
based, and repeatable. At minimum, it provides a 
starting point for managers to understand how indi- 
vidual mechanisms might contribute to habitat loss 
and/or fragmentation. 

We see three immediate applications of this ap- 
proach for wildlife managers. First, this methodology 
has been used to develop refined maps of potentially
suitable habitat for given species of interest. These re- 
fined maps are more defensible because they better re- 
flect known life-history characteristics and can better 
incorporate data uncertainty. Second, the potential 
consequences of management actions or of changes in 
landscape context on local-scale conservation sites 
could be assessed using this approach. For example, 
would a significant loss of habitat occur if a road were 
built through a conservation site? What if five roads
were developed through a site? Third, this approach
offers a way to screen, at ecoregional scales, potentially
imperiled species by identifying critical thresholds
and characteristic landscape patterns that result
from known or predicted landscape changes. For example,
the consequences of urban growth on sensitive
species could be assessed by comparing habitat maps 
that reflect the extent and intensity of urban and sub- 
urban development at two time periods (e.g., 1990 
versus 2020). 


CHAPTER 


60 


Modeling Habitat-based Viability from 
Organism to Population


Gary J. Roloff and Jonathan B. Haufler 


Population viability is a primary management
issue in the United States as directed by the En- 
dangered Species Act (ESA) and National Forest 
Management Act (NFMA). An underlying premise of 
the ESA is to ensure that distinct population seg- 
ments receive appropriate levels of protection and 
that their persistence is ensured into the future (En- 
dangered Species Act of 1973). Similarly, the NFMA 
(and pursuant regulations) requires the United States 
Forest Service to “maintain viable populations of all 
native vertebrate species.” The NFMA does not re- 
quire the Forest Service to identify minimum viable 
numbers per se, but rather to ensure that species per- 
sist over time (Marcot and Holthausen 1987). Popu- 
lation viability analysis (PVA) has most often been 
used to portray the effects of genetic, demographic, 
environmental, and catastrophic stochasticities on 
long-term population trends (Table 60.1). However, 
one of the greatest challenges facing resource man- 
agers on lands throughout the United States is how 
to evaluate the effects of individual management 
projects on population viability and biodiversity (see 
Cogan, Chapter 18), especially over larger areas 
(e.g., national forests, species’ range). To this end, we 
developed a modeling framework that permits evalu- 
ation of operational-level projects in the context of 
overall population viability. 

In this chapter, we discuss population viability as
a planning goal, review the role that habitat resources
(e.g., vegetation, patch configuration; see
Morrison and Hall, Chapter 2) have played in viabil- 
ity analyses, review a process for incorporating habi- 
tat resources into spatially explicit viability assess- 
ments, and demonstrate use of this mechanism for 
three species. In this chapter, a population is defined 
as consisting of all individual locations (Morrison 
and Hall, Chapter 2) and subpopulations of con- 
specifics that are demographically, genetically, or 
spatially disjunct (Wells and Richmond 1995). We 
refer to a subpopulation as a set of individuals that 
are not spatially isolated from other individuals 
(Wells and Richmond 1995), synonymous to Small- 
wood's (Chapter 6) “constrained aggregations.” Our 
example explicitly addresses three components of 
population viability analyses: demographic stochas- 
ticity, environmental stochasticity, and population 
spatial structure. Demographic stochasticity is the 
variation in birth and death rates observed from an 
independent sample of individuals that make up a 
population (Miller and Lacy 1999). Environmental 
stochasticity is variation in the population mean it- 
self (Miller and Lacy 1999). Population structure 
refers to the spatial organization of organisms and 
subpopulations and how those entities interact. Our 
discussion emphasizes PVA on terrestrial fauna
(excluding insects).


Population Viability As a Planning Goal


In the NFMA, a viable population is defined as “one
which has the estimated numbers and distribution of
reproductive individuals to ensure its continued
existence is well distributed in the planning area” (Code
of Federal Regulations, Title 36, Volume 2, Section
219.19). The definition, consistent with other
definitions (Soulé 1987a; Boyce 1992), implies that
“continued existence” is the management goal. PVA was
developed as a tool for assisting land managers in
understanding and portraying the effects of change on
this management goal; however, the number of species
that must be considered under NFMA often makes the
use of PVAs restrictive in resource planning (Raphael
and Marcot 1994). In addition, lack of information
about species vital rates, movement capabilities, and
habitat relationships often prohibits the rigorous
application of PVA. Coarse-filter approaches have been
advocated to help alleviate this problem (Hunter 1990;
Raphael and Marcot 1994; Haufler et al. 1996, 
1999b). These approaches assume that the proper dis- 
tribution and amounts of ecological communities in- 
herent to landscapes will directly relate to the viability 
requirements for most species. Recently, Haufler et al. 
(1996, 1999b) proposed a strategy for ecosystem 
management that used “adequate ecological representation”
as a coarse-filter planning goal to satisfy population
viability. Adequate ecological representation
refers to threshold amounts of ecological communities 
that are described and distributed according to poten- 
tial vegetation, existing vegetation successional stage, 
and historic disturbance regimes (Haufler et al. 1996, 
1999b). Adequate ecological representation is not ad- 
vocated as a management target; rather, it should be 
viewed as a threshold level that is never compromised. 
If adequate ecological representation is not compro- 
mised, then sufficient amounts of ecological communi- 
ties inherent to the landscape are provided, and it is 
assumed that organisms associated with these commu- 
nities are viable (Haufler et al. 1996, 1999b). Accord- 
ing to the coarse-filter strategy proposed by Haufler et 
al. (1996, 1999b), the viability assumption must be 
tested using an appropriate PVA for a select group of 
species that represent the diversity of ecological
communities found in the planning landscape (Haufler et
al. 1996, 1999b). They called this process a “coarse
filter-fine filter approach to ecosystem management.”
As a means of fulfilling the PVA portion of Haufler et 
al.’s (1996, 1999b) approach, we developed a method 
for assessing viability (Roloff and Haufler 1997) that 
can be used to address viability goals in a variety of 
planning scenarios. 

The tangible components of the viability goal in 
NFMA include organism density and the spatial 
arrangement of those organisms. Thus, consistent 
with other efforts that linked demographic and envi- 
ronmental stochasticities, habitat, and metapopulation 
models in a PVA (e.g., Akcakaya et al. 1995; Root 
1998), our viability approach is based on these com- 
ponents. We developed a habitat-based framework for 
indexing organism density and for spatially portraying 
population structure (Roloff and Haufler 1997). In 
this approach, planning units are home ranges of vary- 
ing quality and sizes, information we use as input to 
PVAs. We define home range quality as the ability of 
an area traversed by an organism during a breeding 
cycle to provide conditions appropriate for individual 
persistence (Morrison and Hall, Chapter 2) and suc- 
cessful reproduction. In comparison to habitat quality 
(see Morrison and Hall, Chapter 2), home range qual- 
ity also explicitly defines the area of use and indexes 
the reproductive contribution of that area to popula- 
tion persistence. Our approach portrays the effects of 
resource conditions on population viability and offers 
resource planners a unit of measure (the contribution 
of individual home ranges) that can be mapped and 
linked to population viability planning through time 
(Roloff and Haufler 1997). 

In our approach, the basis of PVA is the quality and 
size of home ranges and their locations (Roloff and 
Haufler 1997). Numerous factors influence home 
range density and spatial arrangement in viability 
analyses, and these factors have been categorized as 
genetic, demographic, environmental, and cata- 
strophic (Shaffer 1981; Gilpin 1987). Ideally, PVA 
should include all of these factors (Gilpin and Soulé 
1986); however, complete data sets are rare, even for 
the most-studied wildlife species (Boyce 1992; Sæther 
et al. 1998). Thus, when faced with the daunting task 
of accounting for the viability of all species in a plan- 
ning area, the challenge in conducting PVAs is to 
create a model that captures important aspects of a
species’ ecology while only using realistically measura- 
ble parameters (Eberhardt 1987; Boyce 1992). Often, 
the most realistically measurable data available for 
population viability planning across large areas are 
habitat based. This conforms to the notion that loss or 
degradation of habitat is the most significant factor 
threatening the future extinction of species (Wilcox 
and Murphy 1985; Pimm and Gilpin 1989). 


A Perspective on Previous Viability 
Analyses and the Role of Habitat 


Except for situations in which population numbers are 
excessively small, the effects of demographic and envi- 
ronmental stochasticities on population vital rates 
have been identified as arguably the most important 
factors influencing population viability (Shaffer 1987; 
Lande 1988a,b; Boyce 1992; Wissel and Zaschke 
1994). Demographic and environmental stochasticities 
are incorporated into viability analyses by first deriv- 
ing the best estimates of individual and subpopulation 
vital rates and then simulating randomness around 
those rates based on the breadth of uncertainty sur- 
rounding the estimates. Most often, the portrayed 
vital rates involve birth and death processes (Table 
60.1). In population viability analysis, these stochastic 
events represent a variety of factors, including varia- 
tion in individual reproduction and survival rates, 
weather, prey and predator distributions, competition, 
and parasites and diseases (Hunter 1990; Caughley 
and Sinclair 1994). We concur that the randomization 
process is essential for portraying the complete range 
of responses that organisms may exhibit to chance 
events; however, the importance of accurate vital rate 
estimates should not be overlooked. To that end, we 
encourage the use of habitat as a means to refine vital 
rate estimation and to constrain the stochastic models. 
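
The two-step procedure described above (best estimates of vital rates first, then randomness simulated around them) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method of any reviewed PVA: the function names, the normal distribution for year-to-year variation, and all parameter values are hypothetical.

```python
import numpy as np

def simulate_population(n0, survival, fecundity, sd_survival, sd_fecundity,
                        years, rng):
    """Project one trajectory and return the list of yearly population sizes."""
    n = n0
    sizes = [n]
    for _ in range(years):
        # Environmental stochasticity: this year's rates vary around the
        # best estimates (normal variation is an assumption of this sketch).
        s = float(np.clip(rng.normal(survival, sd_survival), 0.0, 1.0))
        f = max(float(rng.normal(fecundity, sd_fecundity)), 0.0)
        # Demographic stochasticity: chance outcomes for individuals.
        survivors = int(rng.binomial(n, s))
        recruits = int(rng.poisson(f * n))
        n = survivors + recruits
        sizes.append(n)
        if n == 0:  # extinct; stop projecting
            break
    return sizes

def persistence_probability(n_runs, seed=0, **kwargs):
    """Fraction of simulated trajectories that end with at least one animal."""
    rng = np.random.default_rng(seed)
    persisted = sum(
        int(simulate_population(rng=rng, **kwargs)[-1] > 0)
        for _ in range(n_runs)
    )
    return persisted / n_runs

# Hypothetical inputs: 50 animals projected for 100 years.
p = persistence_probability(
    200, n0=50, survival=0.8, fecundity=0.25,
    sd_survival=0.05, sd_fecundity=0.05, years=100,
)
```

Narrower standard deviations, such as those that habitat-specific vital rates might justify, tighten the spread of simulated trajectories without changing the structure of the simulation.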

We reviewed population viability analyses that 
were published in the readily accessible literature to 
understand the conservation issues driving the analy- 
ses, the temporal scale of analyses, how habitat was 
incorporated, and the modeling approaches used 
(Table 60.1). We reviewed thirty-nine papers that de- 
scribed forty-one population viability analyses on 
thirty-five species (Table 60.1). Most (79 percent) 
PVAs were performed on species that were at least locally 
rare or exhibited range reductions. In papers we 
reviewed, PVAs were most often conducted to evalu- 
ate the effects of management alternatives (Table 
60.1). Other common conservation issues addressed 
were the effects of demographic uncertainty and rein- 
troduction or translocations on species viability (Table 
60.1). PVAs were also conducted to evaluate reserve 
design (Lamberson et al. 1994; Goldingay and Poss- 
ingham 1995), the effects of life-history strategies on 
species viability (Elliott 1996; Vucetich et al. 1997), 
and to compare different modeling approaches (Mills 
et al. 1996). Some PVAs were used as exploratory data 
analysis to provide general insight on the factors influencing 
species’ persistence (Li and Li 1998; Sæther et 
al. 1998). 

The utility of PVAs for resource planning and man- 
agement has been questioned. One problem with PVA 
is that it often depicts species viability over time 
frames beyond operational-level planning horizons 
(Boyce et al. 1994). In addition, the output of PVA (a 
probability of persistence for some time period) is dif- 
ficult to frame as a tangible management objective. In 
a typical natural resource planning scenario, strategic 
goals are established for fifty to one hundred years and operational 
activities for achieving those goals are projected 
in five- to ten-year increments. In an ideal situation, 
progress toward strategic goals is reevaluated 
after the five- to ten-year time period and revised ac- 
cordingly. More realistically, strategic goals change on 
five- to ten-year intervals in response to changes in 
ecological, economic, or social influences, and knowl- 
edge and operational activities are adjusted accord- 
ingly. Most viability assessments reviewed for this 
study projected population dynamics for one hundred 
years or more (Table 60.1), conforming to the notion 
that longer time frames are required to portray popu- 
lation trajectories. Thus, although the temporal scales 
of strategic planning and PVA are theoretically com- 
parable, in reality the modeling requirements of PVA 
dictate that individual operational-level activities get 
“lost” in the stochastic simulations. 

The mismatch between PVA output and opera- 
tional-level decision making should not preclude the 
use of PVA in planning. Rather, PVA is a tool that can 
be used to assess the long-term cumulative effects of 
operational-level management activities on species viability. 
This view is consistent with Thomas et al. 
(1990), who recommended the use of PVA for extrap- 
olating long-term population sizes that are used to 
frame management objectives. Thus, the challenge of 
using PVA for operational-level decision making lies in 
our ability to align the temporal and spatial scales 
used for both processes. This can be accomplished by 
using home ranges as the planning unit, if the home 
ranges are understood in the context of population 
dispersion and regional pattern of distribution (Small- 
wood, Chapter 6). 

Frameworks for integrating habitat quality, quan- 
tity, and spatial arrangement into PVA have been 
demonstrated (e.g., Akçakaya et al. 1995; Lindenmayer 
and Lacy 1995a; Lindenmayer and Possingham 
1996; Root 1998; Forys and Humphrey 1999). Of the 
thirty-nine papers reviewed for this study, twenty-six 
(67 percent) either implied habitat effects via stochas- 
tic modeling or did not model habitat. Of the thirteen 
(33 percent) papers that explicitly incorporated habi- 
tat quality (i.e., some differentiation between resource 
areas as to their fitness contribution, see Morrison and 
Hall, Chapter 2) and location into the PVA, nine (23 
percent) explicitly modeled habitat patches (Table 
60.1). These PVAs used habitat to estimate carrying 
capacity (Forys and Humphrey 1999), model different 
demographics (Foin and Brenchley-Jackson 1991; 
Akçakaya et al. 1995; Root 1998), and assess 
metapopulation structure (Lamberson et al. 1994; 
Akçakaya et al. 1995; Lindenmayer and Lacy 1995a; 
Lindenmayer and Possingham 1996). The findings 
from our review of PVA models are consistent with 
Beissinger and Westphal (1998), who noted that rela- 
tively few modeling approaches incorporated habitat- 
(or patch) specific information. 


Using Habitat and Home Ranges 
for Viability Analyses 


Reasons for species peril are varied (Table 60.1), al- 
though habitat loss and fragmentation are frequently 
cited as primary contributors to population decline. 
The ESA and NFMA viability guidelines explicitly 
identify habitat as a risk or limiting factor important 
for most species managed under these regulations. For 
example, Temple (1986b) demonstrated that habitat 
loss was responsible for 82 percent of the avian taxa 
endangered by extinction. Although Boyce (1992) rec- 
ommended that viability analyses include genetic, 
demographic, environmental, and catastrophic sto- 
chasticities, Foin and Brenchley-Jackson (1991) 
demonstrated that useful estimates of population dy- 
namics could be derived from simple, habitat-based 
models. However, describing habitat is by no means a 
simple task. Habitat is defined as the physical space 
within which an organism lives (Morrison and Hall, 
Chapter 2) and as such can include a wide range of 
environmental resources that often form complex in- 
teractions (Belovsky 1987; Hunter 1990; Caughley 
and Sinclair 1994). Thus, even though the effects of 
habitat on demographic and environmental stochastic- 
ities have been identified as an important aspect of 
population viability, the question remains as to what 
expression of habitat should be quantified in viability 
planning. 

By definition, habitat quality is the suite of re- 
sources and environmental conditions that determine 
the presence, survival, and reproduction of a popula- 
tion (Morrison and Hall, Chapter 2), and thus it 
makes sense to use habitat quality in viability assessments. 
In using habitat quality as a framework for 
PVA, one assumes that home ranges or habitat patches 
of similar quality exhibit more-predictable demographics 
and are exposed to similar levels of environmental 
variation (e.g., rates of predation, forage success; 
Akçakaya et al. 1995; Beissinger 1995). For 
example, Root (1998) showed that Florida scrub-jays 
(Aphelocoma coerulescens) in different vegetation 
patches exhibited different probabilities for survival 
and breeding. She attributed these differences in vital 
rates in part to habitat quality. Stochasticities, 
particularly as applied to birth and death rates in 
viability analyses, can be of overwhelming importance 
(Wissel and Zaschke 1994). Thus, techniques that 
refine stochastic modeling are encouraged. 
PVAs that used habitat patches of differing quality to 
stratify stochastic simulations have been conducted 
(Table 60.1), and the process is analogous to stratifying 
a sample to reduce statistical variance. Similar to 
these studies, our approach uses an organism-based 
scalar hierarchy linked to habitat (see Roloff and 
Haufler 1997) for refining the stochastic simulations 
of PVA. Scales in the hierarchy include home ranges, 
subpopulations, and populations (Roloff and Haufler 
1997). 

Figure 60.1. Viability relationship and associated home 
ranges for white-headed woodpeckers (Picoides albolarvatus) 
in the Idaho Southern Batholith. See Roloff and Haufler (1997) 
for details on the development of the viability relationship. 

As part of the coarse filter-fine filter ecosystem 
management strategy, Haufler et al. (1996, 1999a) 
contended that the basic unit of the fine filter check 
for species viability was the individual home range 
(Roloff and Haufler 1997). Briefly, our habitat-based 
framework for PVA consists of four steps. The first 
three steps involve home-range-level analyses and in- 
clude the identification and mapping of habitat 
based on the organism's life requisites, the develop- 
ment of criteria used for identifying home range via- 
bility, and application of the viability criteria to the 
habitat assessment to map home ranges (Roloff and 
Haufler 1997). The contribution of these home 
ranges to viability is based on an estimate of habitat 
quality with the assumption that higher-quality home 
ranges as scored and mapped in this process will 
offer more-favorable conditions for survival and re- 
production (Roloff and Haufler 1997). The final step 
of the framework involves spatial analyses at the 
subpopulation and population levels. The framework 
as a whole offers the opportunity to integrate organ- 
ism demographics and habitat quality in the PVA 
(Boyce 1992). 

Scoring (as to viability potential) and mapping indi- 
vidual home ranges provides a means to link opera- 
tional-level decision making to population viability for 
planning. Given that resource management affects 
home range quality, size, and location, a method to 
index the magnitude and direction of these management 
effects was desired. To that end, Roloff and 
Haufler (1997) proposed that home ranges could be 
plotted as a viability relationship (Fig. 60.1). Viability 
relationships are species specific, and details of their 
development are provided elsewhere (Roloff and Haufler 
1997). The axes of the viability relationship are 
habitat quality and home range size (Fig. 60.1). In addition, 
the quality axis is divided into viable, marginal, 
and nonviable areas (Roloff and Haufler 1997; 
Fig. 60.1). Viable home ranges are assumed to have 
the lowest adult mortality and to consistently produce 
young (Fig. 60.2; Roloff and Haufler 1997). Marginal 
home ranges are assumed to exhibit variable adult 
mortality and reproductive rates depending on the 
availability of resources. During good resource years, 
these home ranges will contribute to population viability 
similar to viable ranges; however, during lean resource 
years these ranges will not contribute to population 
viability (Fig. 60.2; Roloff and Haufler 1997). 
Nonviable home ranges are assumed to rarely contribute 
to population viability regardless of resource 
availability (see Fig. 60.3 in color section; Roloff and 
Haufler 1997). Operational-level manipulations to 
habitat will alter the quality and area of home ranges, 
and, thus, their relative locations in the viability relationship 
will change. The magnitude, direction, and 
number of home range shifts can be used to estimate 
the effects of management activities on population viability. 

Figure 60.2. Relationships between home range types and 
contribution to population viability. 
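
As a rough illustration of how home range shifts might be tallied, the Python sketch below classifies home ranges by habitat quality score and counts transitions between classes before and after a treatment. The cut points of 50 and 30 are the demonstration thresholds this chapter uses for white-headed woodpeckers; the function names, the example scores, and the idea of matching home ranges by identifier before and after treatment are assumptions made only for this sketch.

```python
from collections import Counter

# Demonstration cut points on the 0-100 habitat quality scale.
VIABLE_MIN = 50
MARGINAL_MIN = 30

def classify(score):
    """Assign a home range to a viability class from its quality score."""
    if score >= VIABLE_MIN:
        return "viable"
    if score >= MARGINAL_MIN:
        return "marginal"
    return "nonviable"

def tally_shifts(pre_scores, post_scores):
    """Count (pre_class, post_class) transitions for matched home ranges.

    pre_scores and post_scores map a hypothetical home range identifier
    to its mean habitat quality score before and after treatment.
    """
    shifts = Counter()
    for hr_id, pre in pre_scores.items():
        post = post_scores[hr_id]
        shifts[(classify(pre), classify(post))] += 1
    return shifts

# Hypothetical scores for three home ranges.
pre = {"hr1": 45, "hr2": 20, "hr3": 62}
post = {"hr1": 55, "hr2": 20, "hr3": 48}
shifts = tally_shifts(pre, post)
```

Off-diagonal entries such as `("marginal", "viable")` indicate the direction of a management effect, and their counts indicate its magnitude.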


An Example: 
White-headed Woodpecker Viability 


To demonstrate the process of identifying, scoring, 
and mapping home ranges as an index to population 
viability, we conducted a habitat-based viability analy- 
sis for white-headed woodpeckers (Picoides albolarva- 
tus) according to the framework of Roloff and Haufler 
(1997) for the Idaho Southern Batholith (Fig. 60.3). 
The Idaho Southern Batholith is 2.3 million hectares, 
a size that corresponds to some ecosystem manage- 
ment efforts (Haufler et al. 1996, 1999a,b). These 
large areas are useful for assessing population viability 
of most species and for establishing strategic planning 
goals; however, it is difficult to demonstrate and quan- 
tify the effects of individual management projects on 
viability over areas that large. We used habitat poten- 
tial modeling (Roloff and Haufler 1997) to score 
white-headed woodpecker nesting and foraging habitat 
quality on a scale of 0 to 100, with 100 representing 
optimum habitat conditions. Scores were based on 
a mathematical representation of the vegetation resources 
required by white-headed woodpeckers to persist 
and reproduce. Nesting and foraging scores were 
assigned to map pixels in the assessment area. Subse- 
quently, nesting and foraging map pixels were aggre- 
gated into home ranges according to Roloff and Hau- 
fler (1997). Each home range was ranked on the same 
scale of 0 to 100 according to the quality and amounts 
of nesting and foraging habitats that were used to de- 
lineate the home range. 

Individual home ranges were plotted on the viabil- 
ity relationship depicted in Figure 60.1 according to 
their size (ordinate) and mean habitat quality score 
(abscissa) for the entire 2.3-million-hectare assessment 
area. For purposes of this demonstration, we assumed 
that scores greater than or equal to 50 represented vi- 
able home ranges, scores from 30 to 49 represented 
marginal home ranges, and scores from 1 to 29 represented 
nonviable home ranges (Fig. 60.1). Thus, all 
home ranges with an average habitat-quality score 
of 50 or greater in Figure 60.1 were considered viable. 
In addition, to understand the precision of our model 
projections, we calculated 90 percent confidence inter- 
vals for our habitat model output according to Bender 
et al. (1996). The process used by Bender et al. (1996) 
quantifies the variability associated with model input 
data (i.e., the measurement of habitat attributes). 
Also, the map pixel aggregation process includes ran- 
domness (Roloff and Haufler 1997), and, thus, differ- 
ent spatial patterns can result from different model 
runs. We spatially aggregated each bootstrap iteration 
three times and calculated the mean number of home 
ranges. Thus, output from our home range analysis in- 
cludes estimates of vegetation sampling error and ran- 
domness in the home range aggregation process. It is 
important to note that the bootstrap process only ac- 
counts for the error associated with model input data 
(Bender et al. 1996); it does not represent the validity 
of the habitat model. 
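
To make the idea of input-data confidence intervals concrete, the sketch below applies a generic percentile bootstrap to the count of viable home ranges. It is not the procedure of Bender et al. (1996), whose details are not reproduced here; the data layout (plot-level habitat scores grouped by home range), the function name, and the example values are hypothetical.

```python
import numpy as np

def bootstrap_viable_count(plot_scores, n_boot=1000, threshold=50,
                           level=0.90, seed=0):
    """Percentile-bootstrap interval for the number of viable home ranges.

    plot_scores maps a home range identifier to the habitat scores of the
    field plots sampled within it (an assumed data layout).
    """
    rng = np.random.default_rng(seed)
    counts = np.empty(n_boot)
    for b in range(n_boot):
        n_viable = 0
        for scores in plot_scores.values():
            # Resample the plots with replacement to mimic sampling error.
            resample = rng.choice(scores, size=len(scores), replace=True)
            if resample.mean() >= threshold:
                n_viable += 1
        counts[b] = n_viable
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(counts, [alpha, 1.0 - alpha])
    return counts.mean(), lower, upper

# Hypothetical plot-level scores for three home ranges.
plots = {"hr1": [60, 70, 65], "hr2": [10, 20, 15], "hr3": [40, 60, 55]}
mean_count, lower, upper = bootstrap_viable_count(plots)
```

As in the chapter's analysis, an interval of this kind reflects only sampling variability in the inputs; it says nothing about whether the habitat model itself is valid.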

Using the viability thresholds in Figure 60.1, the 
model identified a total of 4,640 (4,091-4,915) (mean 
and 90 percent confidence interval) white-headed 
woodpecker home ranges in the Batholith landscape: 
348 (172-616) viable, 1,965 (1,632-2,165) marginal, 
and 2,312 (2,161-2,465) nonviable. Figure 60.1 
shows the results from one of the model iterations. 

To simulate the effects of an operational-level management 
activity on both localized and population-level 
viability, we selected 890 hectares centered on 
an area of marginal to low-quality white-headed 
woodpecker home ranges in the Batholith (Fig. 
60.3). The 890-hectare area is characterized as a mo- 
saic of open shrubland and mid- to high-elevation 
dry forest communities. Dominant habitat type 
(sensu Daubenmire 1952) classes included cool, 
moist Douglas-fir (Pseudotsuga menziesii); cool, dry 
Douglas-fir; warm, dry subalpine fir (Abies lasiocarpa); 
and high-elevation subalpine fir (Haufler et 
al. 1996, 1999b). Forest structure was dominated by 
medium (30.5-50.8 centimeters diameter at breast 
height [dbh]) and large (greater than 50.8 centime- 
ters dbh) single- and multistoried stands. We identi- 
fied 52 hectares of large-tree, densely stocked, single- 
storied forests on the warm, dry subalpine fir habitat 
type class for treatment (Fig. 60.3b). These forests 
consisted primarily of mature Douglas-fir and sub- 
alpine fir and thus were not considered high-quality 
white-headed woodpecker habitat (Garrett et al. 
1996). Therefore, we predicted that the proposed 
treatment would have a minimal effect on white- 
headed woodpecker viability, and we used our 
framework to quantify and document this effect. 

We simulated a historical range of variability harvest 
prescription on the selected 52 hectares and generated 
pre- and post-treatment viability relationships for the 
890-hectare assessment area (Figs. 60.4a,b). Histori- 
cally, these forest types in central Idaho were subjected 
to a fifty- to ninety-year fire mosaic that maintained 
sparsely stocked stands of Douglas-fir and lodgepole 
pine (Pinus contorta). We simulated a prescription that 
thinned the subalpine fir from below and retained 
patches of large Douglas-fir in the overstory. Pretreat- 
ment white-headed woodpecker habitat analysis indi- 
cated that zero (0-1; 90 percent confidence interval) vi- 
able, three (0-4) marginal, and one (0-6) nonviable 
home ranges were located within the 890-hectare as- 
sessment area (Fig. 60.4a). Post-treatment habitat 
analysis indicated that zero (0-1) viable, four (1-6) 
marginal, and zero (0-5) nonviable home ranges were 
located in the assessment area (Fig. 60.4b). Consistent 
with our prediction, treatment appeared to have a negligible 
effect on localized white-headed woodpecker viability. 
Similarly, in the context of the entire Idaho 
Southern Batholith (Fig. 60.1), the treatment had a negligible 
effect on population-level viability. 

Figure 60.4. Pre- and post-treatment viability relationships for 
white-headed woodpeckers (Picoides albolarvatus) in the 890-hectare 
treatment assessment area ([a] and [b], respectively). 


Assessing the Viability of 
Multiple Species 


For individual species, the habitat-based PVA process 
outlined above will provide planners and managers a 
means to evaluate the effects of alternative manage- 
ment scenarios on viability. For example, instead of 
simulating only an historical range of variability tim- 
ber harvest as described above, one could also simu- 
late a clearcut prescription and compare the effects of 
the different prescriptions on white-headed wood- 
pecker viability. But what about the hundreds of other 
species potentially impacted by proposed projects? As 
noted earlier, the magnitude of species numbers and 
computational complexity of the problem necessitates 
a coarse-filter approach. Only by properly addressing 
a coarse-filter approach can maintenance and enhancement 
of both biological diversity and ecosystem 
integrity (see Haufler et al. 1996) be attained. How- 
ever, to assure proper functioning of the coarse filter, 
viability of selected individual species needs to be 
checked (Haufler et al. 1996, 1999b). To keep the 
number of PVAs required for a legitimate check of the 
coarse filter manageable, species should be selected 
using ecological rationale (Haufler et al. 1996, 
1999b). Thus, this process uses an ecological indicator 
species approach (Morrison et al. 1998), where 
species are used to represent ecological communities 
found in the assessment area (Haufler et al. 1996, 
1999b). The limitations of using indicator species 
have been discussed elsewhere (Mannan et al. 1984; 
Patton 1987; Landres et al. 1988); however, indicator 
species are likely useful for monitoring general ecosys- 
tem conditions (Kremen 1992). 

In our example, we selected the white-headed 
woodpecker as a species that represented the sparsely 
stocked, large-tree, understory-fire-maintained pon- 
derosa pine (Pinus ponderosa) ecological community 
that was historically a dominant feature of low- 
elevation sites in the Idaho Southern Batholith land- 
scape. To demonstrate our approach for species repre- 
senting a diversity of coarse-filter conditions, we also 
conducted habitat-based viability assessments for 
pileated woodpeckers (Dryocopus pileatus) and dusky 
flycatchers (Empidonax oberholseri) in the 890- 
hectare assessment area. In areas representative of the 
Idaho Southern Batholith, pileated woodpeckers have 
been shown to represent a more closed-canopy, large-tree 
community with an abundance of snags, downed 
wood, and tree defects (Bull 1987). In the Idaho 
Southern Batholith, this vegetation community historically 
occurred in riparian areas and on mesic sites 
that were less susceptible to a recurring understory fire 
regime. Dusky flycatchers are considered habitat generalists 
and represent more open-forest conditions 
with moderately dense understories and ground cover 
(Dobkin 1994). This type of community is more typical 
of potential vegetation types (sensu Daubenmire 
1952) that support dense shrubs and that are not subjected 
to periodic understory burns. 

Habitat-based viability results for the 890-hectare 
treatment area indicated that pileated woodpecker 
habitat was scarce for both pre- and post-treatment 
conditions (Table 60.2). Only one nonviable pileated 
woodpecker home range was identified (Table 60.2). 
Similarly, no viable dusky flycatcher home ranges were 
identified either pre- or post-treatment in the assessment 
area; however, the model identified twenty-eight 
and twenty-five marginal home ranges pre- and post-treatment, 
respectively (Table 60.2). Although the 
treatment resulted in one additional dusky flycatcher 
home range, it caused a reduction in the number of 
marginal home ranges (Table 60.2). Overlapping 90-percent 
confidence intervals for the dusky flycatcher 
data suggest that these differences are not significant 
(Bender et al. 1996). Thus, using the habitat-based approach 
for assessing viability, the silvicultural treatment 
was estimated to have negligible effect on the 
three species (Table 60.2). 

Our multispecies demonstration used species that 
represent a portion of the ecological complexity found 
in the Idaho Southern Batholith. For the Batholith, we 
suspect that the number of species-specific PVAs required 
in a legitimate check of the coarse filter in a resource 
plan would range from twenty to forty. The actual 
number would depend on landscape complexity, 
the number of protected species that must be considered, 
and the variety of management alternatives 
considered. 


TABLE 60.2. 

Pre- and post-treatment viability relationships for white-headed woodpecker (Picoides albolarvatus), pileated 
woodpecker (Dryocopus pileatus), and dusky flycatcher (Empidonax oberholseri) for the 890-hectare 
treatment area in the Idaho Southern Batholith.^a 

              White-headed         Pileated 
Home          woodpecker           woodpecker           Dusky flycatcher 
range type    Pre       Post       Pre       Post       Pre          Post 
Viable        0 (0-1)   0 (0-1)    0         0          0 (0-8)      0 (0-7) 
Marginal      3 (0-4)   4 (1-6)    0         0          28 (21-38)   25 (14-31) 
Nonviable     1 (0-6)   0 (0-5)    1 (0-1)   1 (0-1)    6 (0-18)     10 (5-16) 

^a Values represent mean and 90% confidence intervals. 
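
The interval comparison used to interpret Table 60.2 can be written as a simple overlap check. The sketch below is only an illustration of that reasoning, with a hypothetical function name; the interval endpoints are the dusky flycatcher marginal home range values reported in Table 60.2.

```python
def intervals_overlap(a, b):
    """Return True if two closed intervals (low, high) share any value."""
    a_low, a_high = a
    b_low, b_high = b
    return a_low <= b_high and b_low <= a_high

# 90 percent confidence intervals for dusky flycatcher marginal home ranges.
dusky_marginal_pre = (21, 38)
dusky_marginal_post = (14, 31)

overlap = intervals_overlap(dusky_marginal_pre, dusky_marginal_post)
```

Overlap is a conservative screen: intervals that do not overlap indicate a difference, but overlapping intervals do not by themselves demonstrate that two estimates are equal.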


Integrating Home Ranges into 
PVA Stochastic Sampling 


For most species managed under the NFMA, the habi- 
tat-based level of analysis discussed above will suffice 
for planning. However, some species that are rare or 
protected under ESA will require a more comprehen- 
sive PVA. A comprehensive PVA considers all aspects 
of viability, including habitat; genetic, demographic, 
environmental, and catastrophic stochasticities; com- 
munity interactions; and spatial structure. In this 
chapter, we focus on demographic and environmental 
stochasticities and spatial structure. Genetics, catas- 
trophes, and community structure are important in 
some PVAs (e.g., Hedrick and Miller 1992; Haig et al. 
1993; Lacy 1997); however, extinction is usually more 
affected by demographic and environmental stochas- 
ticities (Lande 1988a,b; Boyce 1992). Genetic and cat- 
astrophic stochasticities and community interaction 
models (e.g., competition models, predator-prey mod- 
els) may be integrated at the appropriate level in the 
process described herein (Fig. 60.5). 

We recommend the use of PVAs consistent with 
previous efforts that have used habitat to stratify stochastic 
modeling (e.g., Akçakaya et al. 1995; Lindenmayer 
and Lacy 1995b; Root 1998), but we advocate 
the use of home ranges instead of habitat patches as 
the planning unit. The distinction between demo- 
graphic and environmental stochasticities in PVA is 
subtle and conforms to the organism-subpopulation- 
population hierarchy discussed earlier (Fig. 60.5). We 
propose stratifying demographic simulations in PVAs 
by home range types (viable, marginal, and nonviable) 
that form a subpopulation. This approach is analo- 
gous to the "environmental states" modeling de- 
scribed by Beissinger (1995). Classifying environments 
into "states" has been used as a method by demographers 
to measure environmental predictability (reviewed 
by Beissinger 1995). By definition (Roloff and 
Haufler 1997), the demographics of viable and nonvi- 
able home ranges will be less variable than marginal 
ranges (Fig. 60.2). The purpose of stratifying demo- 
graphic variation by home range type is to more con- 
sistently represent these differences in the stochastic 
simulation (Beissinger 1995). 
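
One way to picture stratification by home range type is to draw each home range's annual productivity from a distribution specific to its type and then average across the subpopulation. The Python sketch below does this under loudly stated assumptions: the means, standard deviations, normal distributions, and function name are all hypothetical, chosen only so that marginal ranges are the most variable, as the definitions above imply.

```python
import numpy as np

# Hypothetical (mean, SD) of annual young produced per home range, by type.
# Marginal ranges get the largest SD; viable and nonviable ranges vary less.
RATES = {
    "viable": (2.0, 0.2),
    "marginal": (1.0, 0.8),
    "nonviable": (0.1, 0.1),
}

def simulate_subpopulation(composition, n_years, seed=0):
    """Return yearly mean productivity for a subpopulation.

    composition maps a home range type to the number of home ranges of
    that type in the subpopulation.
    """
    rng = np.random.default_rng(seed)
    types = [t for t, n in composition.items() for _ in range(n)]
    means = np.array([RATES[t][0] for t in types])
    sds = np.array([RATES[t][1] for t in types])
    # One stratified draw per home range per year.
    draws = rng.normal(means, sds, size=(n_years, len(types)))
    draws = np.clip(draws, 0.0, None)  # productivity cannot be negative
    return draws.mean(axis=1)

mostly_viable = simulate_subpopulation(
    {"viable": 8, "marginal": 1, "nonviable": 1}, n_years=500)
mostly_nonviable = simulate_subpopulation(
    {"viable": 1, "marginal": 1, "nonviable": 8}, n_years=500)
```

Because the subpopulation average is built from type-specific draws, its distribution follows the home range composition rather than a single pooled estimate.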

Environmental stochasticity applies to subpopula- 
tions as delineated by groups of home ranges. The key 
to expressing subpopulation averages for vital rates 
that retain the home-range-specific demographics is to 
portray the subpopulation mean and associated data 
distribution in conformity with home range composi- 
tion. For example, if a subpopulation is dominated by 
viable home ranges, environmental simulations should 
occur from a data distribution that conforms to this 
skewness. Several PVA computer programs automate 
this simulation process (Table 60.1), but we are not 
aware of any that accommodate data distributions 
that deviate from normal. Monte Carlo simulation 
works well for normally distributed data and small 
sample sizes (Miller and Lacy 1999). For non-normal 
distributions, bootstrapping has also been used 
(Sæther et al. 1998), but bootstrapped simulations 
based on low sample sizes are suspect. 

Figure 60.5. The components and hierarchical levels of population 
viability analysis (PVA). 

After simulating the effects of demographic and environmental 
stochasticities on the population, the final 
step in the PVA is to analyze the spatial structure of sub- 
populations or conduct a metapopulation analysis. 
Metapopulation analysis is based on the notion that 
subpopulations are spatially structured into assemblages 
of local breeding populations and that immigration and 
emigration among subpopulations have an effect on 
local population dynamics (Hanski and Simberloff 
1997). Some metapopulation parameters (e.g., number 
of emigrants over time; dispersal potential) for the sub- 
populations can be estimated from the habitat-based 
and stochastic analyses discussed earlier. Subpopulations 
that represent "local breeding populations" can be 
identified from mapped viable home ranges. It is impor- 
tant to note that we are not a priori recognizing the 
population as a metapopulation. Metapopulations have 
distinct characteristics (under the classical definition) 
and as such are rare in real systems (Harrison and Tay- 
lor 1997). Rather, we recommend using the tools associ- 
ated with metapopulation analyses (Harrison 1993; 
Hanski 1996; McCullough 1996; Hanski and Gilpin 
1997) to identify and describe the spatial structure of 
the subpopulations and population. 

Spatial structure in PVA influences species viabil- 
ity at two scales: within subpopulations and between 
subpopulations (if more than one exists). Within a 
subpopulation, managers must consider the quality 
of habitat between viable home ranges, distance be- 
tween viable home ranges, movement capabilities of 
the organism, organism susceptibility to mortality 
while in transit, and life history spacing strategies 
(Roloff and Haufler 1997). Between subpopulations, 
the size, shape, and relative contribution of each sub- 
population have important consequences for popula- 
tion persistence and stability (Maurer 1994). The 
questions facing resource managers include whether 
these subpopulations interchange, the contribution 
of individual subpopulations to overall population 
structure and stability, and the resiliency of the sys- 
tem to landscape-level dynamics (e.g., fire, drought). 
Metapopulation analysis was developed to assist 
managers in addressing these questions, and software 
has been developed to aid in these analyses (e.g., 
RAMAS/space; Akçakaya and Ferson 1990; Possingham 
et al. 1992; RAMAS-GIS; Akçakaya 1994). 
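
A first pass at the question of whether subpopulations interchange can be sketched as a distance screen against an assumed dispersal capability. The example below is not how RAMAS or any other cited program works; the function name, the centroid coordinates, and the 10-kilometer threshold are hypothetical, and a real analysis would also weigh the intervening habitat quality and transit mortality noted above.

```python
import math
from itertools import combinations

def connected_pairs(centroids, max_dispersal_km):
    """List subpopulation pairs whose centroids lie within dispersal range.

    centroids maps a subpopulation name to (x, y) coordinates in kilometers.
    """
    pairs = []
    for (name_a, xy_a), (name_b, xy_b) in combinations(centroids.items(), 2):
        if math.dist(xy_a, xy_b) <= max_dispersal_km:
            pairs.append((name_a, name_b))
    return pairs

# Hypothetical subpopulation centroids (km) and dispersal capability.
centroids = {"A": (0, 0), "B": (3, 4), "C": (30, 40)}
pairs = connected_pairs(centroids, max_dispersal_km=10)
```

Subpopulations that appear in no pair, such as C here, are candidates for closer scrutiny of isolation effects.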


Conclusions 


As outlined above, we envision our habitat-based 
PVA approach as a hierarchically structured process 
for indexing the persistence of organisms over time. 
Components of the hierarchy include the organism, 
subpopulations, and population (Fig. 60.5). Although 
the components of a PVA can be associated 
with specific levels in the hierarchy, the effects of each 
component cascade through all levels and are ulti- 
mately expressed as animal numbers and distribu- 
tions and probability of persistence (Fig. 60.5). 
Habitat pervades all of these levels and is an ultimate 
determinant of viability (Gilpin 1987; Lande 1987; 
Lawton et al. 1994; Hanski et al. 1996a; Fahrig 
1997; Drechsler and Wissel 1998). Here, we pre- 
sented a spatially explicit approach to population vi- 
ability modeling that accounts for individuals, sub- 
populations, and populations within a landscape and 
that offers a means for defining spatial relations be- 
tween habitat patches. 

In a complex system such as the environment that 
influences organism viability, validation of the as- 
sumptions, models, linkages, and processes described 
here is critical (Morrison et al. 1998). It is naive to 
think we can consistently model organism viability 
with complete accuracy. Nonetheless, land manage- 
ment decisions that influence viability are made on a 
daily basis, and, thus, we as wildlife professionals 
must offer the tools for making informed, defensible 
decisions. Validation is the key to credibility, and it 
should be used to understand and quantify sources of 
error. Validation should occur at all levels in the 
process described above, starting with the assumptions, 
variables, relationships, and outputs of the 
species habitat model, and extending through all 
levels of the stochastic hierarchy (Fig. 60.5). Model errors 
propagate through a process like the one described 
here (Bender et al. 1996; Morrison et al. 
1998), and, thus, it is critical to minimize that error and
incorporate it into outputs. We encourage model
developers and users to rigorously validate their mod- 
eling processes. In addition, decision makers should 
scrutinize model error and understand the limitations 
of using model output in an inherently variable
environment.


Relations between Canopy Cover
and the Occurrence and Productivity of 
California Spotted Owls 


Carolyn T. Hunsaker, Brian B. Boroski, and George N. Steger 


Effective land management for multiple uses such
as wildlife habitat, timber production, and fuels
management requires a multiscaled landscape plan- 
ning approach (Raphael and Holthausen, Chapter 
62). This chapter addresses forest structural character- 
istics associated with the California spotted owl (Strix 
occidentalis occidentalis) in the southern Sierra 
Nevada of California. The California spotted owl was 
designated as a Sensitive Species on national forests 
throughout California in the late 1970s, and the 
northern spotted owl (Strix occidentalis caurina) was 
federally listed in 1990 as a threatened species. Previ- 
ous studies have shown a strong association between 
both subspecies of spotted owls and mature and old-growth
forests at the scales of (1) home ranges
(Blakesley et al. 1992; Verner et al. 1992; Hunter et al. 
1995; Bart 1995b; Franklin et al. 2000) and (2) habi- 
tat components within home ranges (LaHaye 1988; 
Solis and Gutiérrez 1990; Verner et al. 1992). Franklin 
et al. (2000) found that climate and habitat models 
could account for 88.5 percent of the total observed 
process variation in owl reproduction. 

This chapter evaluates three hypotheses: (1) no dif- 
ference between the composition of canopy-cover 
classes in the study area as a whole and sites used by 
spotted owls, (2) no difference in the composition of 
canopy-cover classes among analysis areas exhibiting 
different levels of occupancy and productivity, and (3)
no difference in conclusions about relations between
owl occupancy and productivity and canopy-cover 
classes based on aerial photography or Landsat The- 
matic Mapper imagery. First, we tested for differences 
in the proportion of canopy-cover classes in the study 
area as a whole and sites used by spotted owls. If owls 
are selecting areas based on canopy cover, the next 
questions to answer are what classes relate to occur- 
rence over time and what classes correspond to differ- 
ent levels of productivity? We therefore tested for dif- 
ferences in the composition of canopy-cover classes 
among analysis areas exhibiting different levels of oc- 
cupancy and owl productivity. Last, we recognized 
that different sources of canopy-cover data might re- 
sult in a different composition of the canopy-cover 
classes within the landscape. Thus, we tested whether 
the rejection of the first two hypotheses was influenced 
by the data source (aerial photography and satellite 
imagery) used to derive canopy-cover values. 

The study was done in the Sierra National Forest in 
the Sierra Nevada of California (Fig. 61.1). The study 
area encompassed about 60,600 hectares and ranged 
in elevation from 853 to 2,743 meters. Five vegetation 
types, as described in Mayer and Laudenslayer (1988), 
occur in the study area: montane hardwood-conifer, 
ponderosa pine (Pinus ponderosa), Sierran mixed- 
conifer, white fir (Abies concolor), and red fir (Abies 
magnifica). 

Figure 61.1. Location of the spotted owl study area within the Sierra National Forest and the state of California.


Methods


Spotted owls were monitored from 1990 through 
1998. Owl activity centers are areas within which 
owls find suitable nesting sites and several suitable 
roosts and in which they do a substantial amount of 
their foraging (Zabel et al. 1992). Activity centers are 
smaller than both territories and home ranges. The 
mean home-range size was estimated by the minimum 
convex polygon (MCP) method (Mohr 1947; Odum 
and Kuenzler 1955; Jennrich and Turner 1969) to
be 728 ± 319 hectares for individual owls during the
breeding season in conifer forests of the Sierra Na- 
tional Forest (Verner et al. 1992). For our study, the 
central location of an activity center is a single geo- 
graphic location where owls were observed roosting 
or nesting repeatedly over time; we refer to this loca- 
tion as a site. 


Owl Demographics 


Roosts and nests were located using methods de- 
scribed by Forsman (1983) and based on actual owl 
observations. Sex, age class, pair status, nesting status, 
and reproductive success of each owl were determined 
(Forsman 1981, 1983). Reproductive success at each 
site was determined through repeat observations (min- 
imum of six) of adult owls and their fledglings (Steger 
et al. 1993). 

A productivity score was assigned to each site 
every year it was surveyed to protocol standards 
(Steger et al. 1993). Values ranged from 0 to 9 based 
on the presence of owls, their attempt to nest, and 
the number of fledglings produced. A nonsequential 
series was used to represent the extra energy required 
to nest and produce fledglings. Sites that were sam- 
pled and determined to be empty were assigned a 
value of zero while a score of 1 was given to those 
containing a single bird. Sites with non-nesting pairs 
and pairs that nested but failed to fledge young were 
given scores of 2 and 4, respectively. Lastly, scores of 
7, 8, and 9 were assigned to sites producing one, 
two, and three fledglings. A summary productivity 
index was calculated for each site across the years it 
was surveyed by summing the yearly scores for the 
site and dividing by the number of years it was sur- 
veyed (Table 61.1). 
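
The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the status labels and the function name are hypothetical names chosen here, while the score values and the example scores for site 4 are taken from the text and Table 61.1.

```python
# Score values as described in the text; the status labels are illustrative names.
SCORE_BY_STATUS = {
    "no owls": 0,
    "single owl": 1,
    "non-nesting pair": 2,
    "failed nest": 4,
    "one fledgling": 7,
    "two fledglings": 8,
    "three fledglings": 9,
}


def summary_productivity(yearly_scores):
    """Return the summary index: sum of yearly scores / years surveyed."""
    if not yearly_scores:
        raise ValueError("a site must be surveyed in at least one year")
    return sum(yearly_scores) / len(yearly_scores)


# Site 4 in Table 61.1 was surveyed in all nine years.
site_4 = [7, 2, 8, 2, 8, 2, 2, 2, 4]
print(round(summary_productivity(site_4), 2))  # 4.11
```

Because only surveyed years enter the denominator, a site surveyed once (such as site 91b) simply keeps its single yearly score as its summary score.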


Estimation of Owl Activity Centers 


Telemetry data were not available for owls at every 
site, but a subset of spotted owls was captured, fitted 
with radio transmitters, and tracked from spring 1987 
through fall 1990. The number of locations per owl 
ranged from thirty-two to sixty-seven (means and 
standard deviations by year: 1987: 45.6, 6.9; 1988: 
56.6, 12.4; 1989: 38.5, 11.0) and typically repre- 
sented single nightly locations that were mapped on 
7.5-minute-series maps (1:24,000-scale U.S. Geologi- 
cal Survey, Menlo Park, CA 94025). We used data 
from these owls to calculate generic owl analysis 
areas. We considered it reasonable to represent the 
generic owl analysis areas as circles because owls fit a 
central-place foraging pattern. To calculate these cir- 
cles, we first estimated 50 percent, 70 percent, and 90 
percent MCP home-range estimates for each radio-tagged
owl and represented home range areas as circles
having the radii r50, r70, and r90. Mean values for
these radii were calculated from all radio-tagged owls 
having at least twenty-five locations between sunset 
and sunrise throughout the breeding season. Using a 
geographic information system (GIS) (ARC/INFO 
Version 7.0, Environmental Systems Research Insti- 
tute, Redlands, CA 92373), these mean radii were 
used to generate three concentric circles of 72, 168, 
and 430 hectares around the central location of each 
owl activity center. Here, we call these circles owl 
analysis areas and use them to represent areas likely to 
be used by owls. 
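
The conversion between a circle's radius and its area, which underlies the three analysis areas, can be checked with a short sketch. The function names are hypothetical; the only facts used are the area of a circle and 1 hectare = 10,000 square meters.

```python
import math

SQ_M_PER_HA = 10_000.0


def circle_area_ha(radius_m):
    """Area in hectares of a circle with the given radius in meters."""
    return math.pi * radius_m ** 2 / SQ_M_PER_HA


def circle_radius_m(area_ha):
    """Radius in meters of a circle with the given area in hectares."""
    return math.sqrt(area_ha * SQ_M_PER_HA / math.pi)


# Radii implied by the three analysis areas used in this chapter.
for area in (72, 168, 430):
    print(area, "ha ->", round(circle_radius_m(area)), "m")
```

Under these assumptions the 72-, 168-, and 430-hectare circles correspond to radii of roughly 479, 731, and 1,170 meters.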


Vegetation Classification 


Tree canopy-cover values came from two different veg- 
etation data sets, one based on aerial photography and 
the other based on Landsat Thematic Mapper imagery. 
For the aerial photography (hereafter referred to as 
photography), vegetation was classified by G. N. Steger 
into habitat communities representing a homogeneous 
unit of vegetation for overstory, understory, and base 
material (soil or rock). The key features used to deter- 
mine homogeneity in the stands were size and spacing 
of trees and the species and density of the understory 
vegetation. A minimum mapping unit of 1 hectare was 
used for delineating vegetation class polygons, although 
some features such as meadows, lakes, and houses were
portrayed using a resolution of 0.5 hectare. Vegetation
typing from aerial photographs followed the guidelines
from Avery (1978). Using a stereoscope, lines delineating
polygons were drawn on 22.9 by 22.9 centimeter
vertical photographs flown in 1996. These images were
geocorrected when transferred onto orthophotoquads
(U.S. Geological Survey, Menlo Park, CA 94025) and
then scanned into a GIS (CartaLinx, Clark Lab, Clark
University, Worcester, MA 01610-1477).


TABLE 61.1.


Productivity indices for California spotted owls (Strix occidentalis occidentalis) on the Sierra National Forest in
California.


           Scores by year^a (1990-1998;     Summary    Years
Site ID    surveyed years only)             score^b    surveyed

3          2 1 1 0 0 0 0                    0.57       7
4          7 2 8 2 8 2 2 2 4                4.11       9
5          0 1 1 0 1                        0.60       5
6          7 1 1 2 2 2 4                    2.71       7
9          2 1 0 0                          0.75       4
15         2 0 0 0 0                        0.40       5
25         8 7 7 2 4 4 2 1 7                4.67       9
31         0 1 0 1                          0.50       4
33         8 7 7 2 2 2 0                    4.00       7
35         2 2 9 2 7 2 7 2 2                3.89       9
36         2 1 0 0 0 0 0 0                  0.38       8
38         2 2 8 8 2 2 2 2 2                3.33       9
41         2 2 4 7 4 4 4 2 4                3.67       9
43         2 9 2 2 2 2 0 2                  2.63       8
48         1 2 7 2 2 2 2 2 0                2.22       9
49         8 2 9 4 4 4 2 4 0                4.11       9
53         8 2 8 8 2 1 2 2 1                3.78       9
57         7 8 2 0 0 0 0                    2.43       7
58         2 7 8 2 2 2 2 2 2                3.22       9
61         7 7 8 8 4 2 2 2 7                5.22       9
62         8 2 2 8 2 0 0                    3.14       7
64         1 9 2 4 2 8 4                    4.29       7
65         7 2 7 8 2 4 4 2 1                4.11       9
67         2 2 8 4 7 2 2 2 2                3.44       9
70         2 1 1 1 2 1 0 1                  1.13       8
77         8 2 9 7 2 2 2 2 1                3.89       9
80         8 2 0 1 1 0 0 0                  1.50       8
83         1 2 9 2 2 0 0 0 0                1.78       9
84         2 2 8 2 2 2 2 2 2                2.67       9
87         2 7 2 7 2 8 8 2                  4.75       8
91a        8 4 8 2 4 4 2 2 4                4.22       9
91b        4                                4.00       1
100        1 2 8 2 8 2 2 2 2                3.22       9
219        2 2 1 2 2                        1.80       5
221        7 9 2 8 2                        4.20       5
225        2 0 0                            0.67       3
227        2 9 2 2                          2.00       4
228        2 2 2 2 2                        2.00       5
229a       8 2 2 1 2                        3.00       5
229b       1 2 2 2                          1.75       4
230        2 2 2 2 2                        2.00       5
234        2 2 2 2 2                        2.00       5
239        8 2 8 2 0                        4.00       5
241        8 2 2 2 8                        4.40       5
244        2 7 2 2 1                        2.80       5
245        8 2 2 2 2                        3.20       5
247        8 2 2 8 7                        5.40       5
257        2 0                              1.00       2
266        2 0 0                            0.67       3


a 0 = no owls; 1 = single owl; 2 = non-nesting pair; 4 = failed nest; 7 = one fledgling; 8 = two fledglings; 9 = three
fledglings.

b Summary score = (sum of annual scores / years surveyed).

The structural characteristics derived from the pho- 
tography included crown cover of trees with a crown 
diameter greater than 8 meters, crown cover of all 
trees, and crown cover of all vegetation with each 
being subdivided into five canopy-cover categories 
(State of California Resource Agency 1969). The type 
of vegetative cover and dominance of land-cover 
types—conifer, hardwood, shrub, grass, ground, rock, 
or cultivated land—were recorded in rank order from 
highest to lowest within the polygon. Crown-cover in- 
crements of 5 percent were used for conifer, hard- 
wood, and shrub types, and our classes were based on 
these values. For this analysis, the vegetation and 
structural data from the photography data have not 
been extensively groundtruthed; however, this is being 
done in 2000-2001. 

The second data set for canopy cover was a 1995 
classified image from the Landsat Thematic Mapper 
satellite (hereafter referred to as Landsat); the classified
image was produced by San Diego State University
(J. Franklin personal communication). The classified
image had a pixel size of 30 meters and represented
canopy cover in 10 percent increments using a
canopy reflectance model. We ran a 7x7 modal filter 
followed by a 3x3 modal filter on this image so the 
mean patch size per class was similar to the photogra- 
phy data (F = 0.87, df = 4,6426, P = 0.48). Averaging 
attributes over areas larger than the single grid cell 
tends to reduce errors in the estimates of those geo- 
graphic phenomena that are spatially autocorrelated. 
Although formal accuracy assessment is not available 
for the Landsat data, the producers of these data be- 
lieve that the canopy cover falls within two cover 
classes (20 percent) of the actual cover most of the 
time and within one cover class much of the time (J. 
Franklin personal communication). 
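
The smoothing step can be illustrated with a small, pure-Python modal (majority) filter. This is a sketch under stated assumptions rather than the software actually used: the chapter does not say how image edges or ties were handled, so here the window is clipped at the grid edge and a tie keeps the original cell value when it is among the modes (otherwise the lowest class value is used). The function name and the toy grid are hypothetical.

```python
from collections import Counter


def modal_filter(grid, size):
    """Replace each cell by the most frequent class in its size x size window.

    Assumptions (not from the chapter): windows are clipped at the grid
    edge, and ties keep the original value when it is one of the modes.
    """
    half = size // 2
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r in range(n_rows):
        new_row = []
        for c in range(n_cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(n_rows, r + half + 1))
                for cc in range(max(0, c - half), min(n_cols, c + half + 1))
            ]
            counts = Counter(window)
            top = max(counts.values())
            modes = sorted(value for value, n in counts.items() if n == top)
            new_row.append(grid[r][c] if grid[r][c] in modes else modes[0])
        out.append(new_row)
    return out


# Toy classified image: one isolated class-5 pixel inside a class-1 stand.
image = [[1] * 5 for _ in range(5)]
image[2][2] = 5

# Two passes, as in the text: a 7x7 filter followed by a 3x3 filter.
smoothed = modal_filter(modal_filter(image, 7), 3)
```

In this toy case the isolated pixel is absorbed into the surrounding class, which is how the filter increases mean patch size.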

For this analysis, forest structure was represented 
as five categories of canopy closure: 0-19 percent, 
20-39 percent, 40-49 percent, 50-69 percent, and
70-100 percent. The planimetric proportions of each 
canopy-cover class were derived for the three concen- 
tric owl analysis areas around each owl activity center. 
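
As a rough illustration of this step, the sketch below bins canopy-cover percentages into the five classes and computes the proportion of each class among the cells of one analysis area. It assumes equal-area raster cells, so that cell counts stand in for planimetric area; the class numbering (1 = 70-100 percent through 5 = 0-19 percent) follows Table 61.3, and the function names and sample values are hypothetical.

```python
from collections import Counter

# (lower bound in percent, class number), ordered from densest to most open.
CLASS_LOWER_BOUNDS = [(70, 1), (50, 2), (40, 3), (20, 4), (0, 5)]


def canopy_class(percent_cover):
    """Return the canopy-cover class (1-5) for a cover value in percent."""
    if not 0 <= percent_cover <= 100:
        raise ValueError("canopy cover must be between 0 and 100 percent")
    for lower, class_id in CLASS_LOWER_BOUNDS:
        if percent_cover >= lower:
            return class_id


def class_proportions(cover_values):
    """Proportion of equal-area cells falling in each class (1-5)."""
    counts = Counter(canopy_class(v) for v in cover_values)
    total = len(cover_values)
    return {class_id: counts.get(class_id, 0) / total for class_id in range(1, 6)}


# Hypothetical cover values for the cells inside one analysis area.
cells = [80, 75, 55, 10]
print(class_proportions(cells))
```

The proportions for an analysis area always sum to 1, which is why the later analyses treat them as compositions rather than as independent variables.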


Canopy Cover of Owl Analysis Areas 


We examined second-order habitat selection (Johnson 
1980) by testing the null hypothesis that spotted owls 
arbitrarily selected areas to use based on the available 
composition of canopy-cover classes between the 
elevations of 853 and 2,743 meters within the study 
area. The maximum likelihood statistics resulting
from the compositional analysis method described by 
Aebischer et al. (1993) were used to determine signifi- 
cant departures from random for the concentric analy- 
sis areas within the 60,600-hectare study area. 

The above approach was also used to examine 
third-order selection (Johnson 1980) by testing the 
null hypothesis that the composition of canopy-cover 
classes within the 72-hectare analysis area was no dif- 
ferent from the composition within the 430-hectare 
analysis area. An additional compositional analysis 
tested the null hypothesis that the central locations of 
the analysis areas for our middle size (168-hectare) 
analysis areas reflected the classes’ availabilities within 
these areas, ranking the classes based on their relative 
occurrence as central locations (Aebischer et al. 1993). 
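
The core transformation in compositional analysis can be sketched as follows. Used and available proportions are each expressed as log-ratios against a reference class, and the difference between the two indicates whether a class is used more (positive) or less (negative) than expected relative to that reference. This is a simplified illustration only: the full method tests these differences across all animals with a likelihood ratio statistic, which is not shown here. The function name, the zero-replacement value, and the example proportions are assumptions made for the sketch.

```python
import math


def log_ratio_differences(used, available, reference, zero_replacement=0.0001):
    """Per-class difference between used and available log-ratios.

    `used` and `available` map class id -> proportion. Zero proportions are
    replaced by a small positive value (an assumption of this sketch) so
    that the logarithm is defined.
    """
    def safe(p):
        return p if p > 0 else zero_replacement

    used_ref = safe(used[reference])
    avail_ref = safe(available[reference])
    diffs = {}
    for class_id in available:
        if class_id == reference:
            continue
        used_lr = math.log(safe(used[class_id]) / used_ref)
        avail_lr = math.log(safe(available[class_id]) / avail_ref)
        diffs[class_id] = used_lr - avail_lr
    return diffs


# Hypothetical compositions for one analysis area and for the study area.
used = {1: 0.52, 2: 0.13, 3: 0.04, 4: 0.16, 5: 0.15}
available = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.20, 5: 0.25}
print(log_ratio_differences(used, available, reference=5))
```

With five classes there are four log-ratio differences per analysis area, which matches the four degrees of freedom reported for the likelihood ratio statistics.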

The occupancy of activity centers by owls varied 
throughout the study period (Table 61.1). Using a gen- 
eral linear model (SPSS 1998), we examined the rela- 
tionship between the occupancy of a site (no birds, 
single bird, or pair) and yearly variation, the canopy- 
cover class at the center of the activity center, and the 
proportion of moderate to dense (50-100 percent 
canopy cover) and sparse to open habitat (0-39 per- 
cent canopy cover) within the 72-hectare analysis 
area. At this stage of analysis, multivariate models are 
being used to explore relations between habitats and 
owl occurrence and productivity. 

Productivity scores were compared to the proportions
of canopy-cover classes within the analysis areas
to determine whether productivity correlated well with 
the classes. We calculated Pearson correlation coeffi- 
cients for groups of classes: classes 1 and 2, and 3 
through 5. Using a generalized additive model for this 
binary data (S-PLUS 4 1997; Hastie and Tibshirani 
1990), we evaluated the relations between observed 
site characteristics and the success of activity centers (a 
successful activity center was defined to mean that it 
includes a nesting pair of owls). Explanatory variables 
included in the model were year, the canopy-cover class 
at the center of the activity center, and a smooth func- 
tion of the percentage of the analysis area within the 
40-49 percent, 50-69 percent, and 70-100 percent 
canopy-cover classes. Using the same variables, we 
used a generalized additive model to explore the rela- 
tions between the canopy-cover classes in an analysis 
area and the number of fledglings at a successful site. 
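
One property of the grouped correlations is worth making explicit: because the proportions in classes 1 and 2 and in classes 3 through 5 sum to 1 within each analysis area, their Pearson correlations with any score are equal in magnitude and opposite in sign. The sketch below demonstrates this with made-up numbers; the data and function name are hypothetical and are not values from this study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical summary productivity scores and proportions of each
# analysis area in classes 1 and 2 (canopy cover of 50 percent or more).
scores = [0.5, 1.8, 2.7, 3.2, 4.1, 5.2]
prop_classes_1_2 = [0.40, 0.55, 0.50, 0.70, 0.65, 0.80]
prop_classes_3_4_5 = [1.0 - p for p in prop_classes_1_2]

r_dense = pearson_r(scores, prop_classes_1_2)
r_open = pearson_r(scores, prop_classes_3_4_5)
print(round(r_dense, 2), round(r_open, 2))
```

The two coefficients therefore carry the same information, and a table of such correlations should show mirrored values for the two class groups.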


Results 


Nine owls in 1987 and eight each in 1988 and 1989 
met the criteria of having at least twenty-five night- 
time locations. The mean size of estimated home 
ranges for owls was similar across years for MCP esti- 
mates based on 50 percent, 70 percent, and 90 percent 
of the radio locations (Table 61.2). The three concen- 
tric owl analysis areas calculated from the mean radii 
for individual home ranges were 72, 168, and 430
hectares. Figure 61.2 depicts these analysis areas for
site 61 with respect to the canopy-cover classes derived
from photography and from Landsat.


TABLE 61.2.


Minimum convex polygon home ranges (hectares) of California spotted owls in the Sierra National Forest, Fresno County,
California.


                                                   Means^a

                         50% of locations         70% of locations         90% of locations

Year             n    Area         SD    Radius   Area    SD    Radius   Area    SD     Radius
1987             9    73.8         44.3  467      194.2   99.6  763      442.9   193.4  1159
1988             8    [illegible]  36.9  468      169.4   97.2  713      457.0   230.5  1175
1989             8    83.4         37.4  503      165.9   62.9  714      458.5   228.9  1176
All years             76.7         38.5  479      177.2   86.0  731      452.4   208.1  1170
Test statistic        0.80                        0.57                   0.001
Probability           0.67                        0.75                   0.99


a Mean radius (m) represents the average radius of circles having areas equal to the individual home-range estimates. Test statistics
comparing home-range area across the three years are from Kruskal-Wallis one-way analysis of variance tests, assuming a chi-square
distribution with two degrees of freedom.

Forty-nine owl activity centers were surveyed 
within the study area from 1990 to 1998. The number 
of years surveyed during the nine-year period varied 
but averaged seven years per site (range one to nine 
years). The distributions for occurrence and produc- 
tivity values were non-normal and reflected the ten- 
dency for sites to be occupied by non-nesting pairs of 
birds and pairs that nested but failed to fledge young 
(median summary score for productivity = 3, range 
0.38 to 5.40). Pairs of owls were observed at the 
forty-nine sites nearly 80 percent of the time (median 
occurrence = 2, sd = 0.72). 


Vegetation Composition 


Although the proportion of a given canopy-cover class 
within the study area varied depending on the method 
used to characterize the landscape (Table 61.3), both
methods resulted in similar distributions of available 
habitat (Kolmogorov-Smirnov two-sided probability = 
0.82) (Fig. 61.3). Landsat data depicted more of the 
landscape in canopy-cover classes of 50-69 percent 
and 70-100 percent as compared to photography 
data. Mean polygon size differed significantly among 
classes for both the photography (F = 4.80, df =
4, 2,710, P < 0.001) and Landsat data (F = 4.23, df =
4, 3,716, P = 0.002). The mean size of polygons depicting
areas with at least 70 percent canopy cover was
significantly greater than for canopy cover classes 
from 0 to 49 percent when photography data were 
used and significantly greater than the 20-39 percent 
and 40-49 percent classes when Landsat data were 
used (Table 61.3). The 70-100 percent and 0-19 per- 
cent canopy-cover classes were dominant in the pho- 
tography data set as compared to the Landsat data set, 
in which the 50-69 percent and 70-100 percent 
classes were dominant. 

Figure 61.2. The canopy-cover classes estimated from aerial
photography and Landsat Thematic Mapper data are shown for
the three different sizes of owl analysis areas around site 61.

Figure 61.3. Distribution of canopy-cover classes for the study
area by data source: aerial photography and Landsat Thematic
Mapper.


TABLE 61.3.


Data on available habitat classes derived from aerial photography and Landsat Thematic Mapper data
for a 60,600-hectare study area on the Sierra National Forest.


         % Canopy   Area      % of        Number of   Mean polygon   Standard      Class
Class    closure    (ha)      landscape   polygons    size (ha)      deviation     differences^a

Aerial photo
1        70-100     16,894    28          453         [illegible]    [illegible]   3,4,5
2        50-69       9,459    16          410         [illegible]    [illegible]
3        40-49       5,940    10          341         17.4           34.8          1
4        20-39      11,556    19          561         20.6           [illegible]   1
5        0-19       16,732    28          950         17.6           88.3          1

Landsat Thematic Mapper
1        70-100     21,784    36          514         42.4           471.7         3,4
2        50-69      17,987    30          871         20.7           78.1
3        40-49       3,734     6          815         4.6            [illegible]   1
4        20-39       4,481     7          751         5.9            [illegible]   1
5        0-19       12,540    21          770         [illegible]    62.8


a Class differences based on mean polygon size were determined by an analysis of variance test followed by a post
hoc Tukey multiple comparison test. Significant differences between classes were based on an alpha level of 5
percent.


Compositional analyses based on both photography
and Landsat data sets for canopy cover lead us to
reject the hypothesis that the selection of sites used by
owls was random within the study area. The likelihood
ratio statistics were 41, 30, and 34 for the photography
data and 69, 45, and 43 for the Landsat data
with reference to the 72-hectare, 168-hectare, and
430-hectare analysis areas (in all cases df = 4, P <
0.00001). Results from both data sets also rejected the 
null hypothesis that the composition of habitat classes 
within the 72-hectare area was not different from that 
in the 430-hectare area (likelihood ratio statistics 
equaled 47 (photography) and 56 (Landsat), df = 4, P 
< 0.00001). Last, we examined the canopy-cover 
classes for the polygon at the center of the analysis 
area and determined that the selection of these central 
locations within the 168-hectare areas was signifi- 
cantly nonrandom (likelihood ratio statistics equaled 
28 (photography) and 19 (Landsat), df = 4, P < 
0.001). Rankings of the classes based on their relative 
use within the analysis areas were different for the 
photography and Landsat data. The photography data 
resulted in the ranked order of 1 > 3 > 2 > 5 > 4 while 
the Landsat data returned a ranking of 1 > 4 > 3 > 5 >
2, where underlined classes reflect no significant dif- 
ference at an alpha level of 5 percent. 


Owl Occupancy 


The results from general linear models examining the 
relations between occurrence, yearly variation, and 


canopy cover were similar for both data sets, with 
the exception of the factor that described the 
canopy-cover class at the activity center. In both 
models, year was significant (photography F = 2.8, 
df = 8, 315, P = 0.005, Landsat F = 3.5, df = 8, 312, 
P = 0.001) as was the proportion of the 72-hectare 
area comprising habitat with canopy cover between 
0 and 39 percent (photography F = 49.4, df = 1, 315, 
P < 0.001; Landsat F = 5.3, df = 1, 312, P = 0.022).
In both models, the occupancy of sites increased as 
the proportion of the area having canopy-cover val- 
ues between 0 and 39 percent declined, but the pro- 
portion affecting occupancy differed for photogra- 
phy and Landsat data sets (Fig. 61.4). In the model 
with Landsat data, the canopy-cover class at the ac- 
tivity center was also a significant variable (F = 6.8, 
df = 3, 312, P < 0.001). When the canopy cover at 
the activity center was between 20 and 39 percent, 
occupancy averaged 0.75 birds (sd = 0.87), which 
was significantly less (Tukey multiple comparison 
test P < 0.02) than when the crown cover at the ac- 
tivity center was greater than 50 percent (cover 
50-69 percent, mean = 1.51 birds, sd = 0.83; cover 
70-100 percent, mean = 1.71 birds, sd = 0.63). 

Figure 61.4. The relations between canopy cover and the occurrence
of spotted owls within 72-hectare areas surrounding
known owl activity centers for two image sources. Data on the
annual occupancy of the activity centers were based on nine
years of data from the Sierra National Forest, Fresno County,
California.


Owl Productivity 


For both image types and across the three analysis 
areas, scores of productivity were positively correlated 
with the proportion of the analysis area having greater 
than or equal to 50 percent canopy-cover and nega- 
tively correlated with those having less than 50 
percent cover (Table 61.4). The median proportions 
for these classes for unproductive sites (productivity 
scores less than or equal to 2) and productive sites 
(productivity scores greater than 2) were different for 
estimates based on the aerial photography and Land- 
sat imagery (Fig. 61.5). Mann-Whitney U test statis- 
tics for each analysis area by breeding status combina- 
tion returned probability values of less than 0.0001 
for comparisons between proportions derived from 
photography and Landsat imagery. The proportions of 
the analysis areas in habitat classes with greater than 
or equal to 50 percent canopy cover were substantially
greater for sites with productivity scores greater than 
2 (Table 61.5). As analysis areas increased in size, the 
trend was for a greater proportion of the areas to be 
in classes 3, 4, and 5 irrespective of the productivity 
score and type of imagery. The median proportions 
for classes 1 and 2 were consistently larger for analysis 
areas having a productivity score greater than 2 for 
photography data and larger or almost the same for 
Landsat data. For all analysis areas, the median pro- 
portions of the 0-19 percent canopy-cover class (un- 
suitable owl habitat) were larger in the areas having 
productivity scores less than 2. 

Models of productivity using canopy-cover esti- 
mates from photography data had more significant 
class determinants than models using data from Land- 
sat. The generalized additive model for binary data 
(owl[s] present or owl[s] productive) using photography
identified year (P < 0.001) and the proportion of
the area within 70-100 percent (P = 0.047) and 40-49
percent (P = 0.036) canopy-cover classes as significant
factors affecting the probability of the site consistently
supporting breeding activity. The logistic model using
Landsat identified only year (P < 0.001) and the proportion
of the area in the 40-49 percent canopy-cover
class (P = 0.011) as significant factors.

TABLE 61.4.


Pearson correlation coefficients between productivity scores
and the proportion of canopy-cover classes derived from aerial
photography and Landsat data.


                 Analysis        Classes            Classes
Image source     area^a (ha)     1, 2       P       3, 4, 5     P

Photography      72              0.30       0.04    -0.30       0.04
Landsat          72              0.29       0.04    -0.29       0.04
Photography      168             0.35       0.01    -0.35       0.01
Landsat          168             0.36       0.01    -0.36       0.01
Photography      430             0.37       0.01    -0.37       0.01
Landsat          430             0.33       0.02    -0.33       0.02


a Probability values are given for three owl-analysis areas (72 hectares,
168 hectares, and 430 hectares) centered on owl nest sites or
primary roost sites and are based on Bartlett's chi-square statistic
with one degree of freedom.


TABLE 61.5.


Proportion of canopy-cover classes within owl analysis areas^a on the Sierra National Forest derived from aerial
photography and Landsat imagery.


Canopy-cover               Productivity scores^c ≤ 2                        Productivity scores^d > 2
class (%)     Imagery^b    %^e          Median       Range      SD            %^e          Median       Range      SD

72-hectare area
70-100        A            94           0.52         0.00-0.94  0.24          100          0.53         0.06-0.92  0.23
              L            100          0.61         0.12-0.98  0.25          100          0.67         0.09-1.00  0.26
50-69         A            72           0.04         0.00-0.62  0.18          84           0.13         0.00-0.78  0.20
              L            100          0.22         0.02-0.77  0.22          94           0.20         0.00-0.68  0.22
40-49         A            72           0.03         0.00-0.29  0.08          68           0.04         0.00-0.56  0.14
              L            61           0.02         0.00-0.14  0.04          58           0.00         0.00-0.16  0.04
20-39         A            94           0.16         0.00-0.53  0.17          87           0.07         0.00-0.41  0.10
              L            50           0.00         0.00-0.25  0.07          55           0.00         0.00-0.13  0.03
0-19          A            89           [illegible]  0.00-0.42  [illegible]   90           0.07         0.00-0.38  0.08
              L            83           0.03         0.00-0.42  0.10          65           0.01         0.00-0.30  0.08

168-hectare area
70-100        A            94           0.46         0.00-0.89  0.21          100          0.48         0.16-0.84  0.18
              L            100          0.61         0.09-0.93  0.24          100          0.59         0.10-0.98  0.23
50-69         A            89           0.09         0.00-0.44  [illegible]   87           [illegible]  0.00-0.62  0.16
              L            100          0.17         0.05-0.63  0.18          100          0.24         0.01-0.58  0.18
40-49         A            89           0.04         0.00-0.33  0.09          [illegible]  0.06         0.00-0.48  [illegible]
              L            83           0.02         0.00-0.22  0.06          87           0.02         0.00-0.13  0.03
20-39         A            100          0.20         0.00-0.47  0.13          100          [illegible]  0.01-0.33  0.09
              L            83           0.03         0.00-0.31  0.09          [illegible]  0.01         0.00-0.15  0.04
0-19          A            100          0.19         0.00-0.35  0.10          97           0.10         0.00-0.31  0.07
              L            89           0.09         0.00-0.34  0.09          90           0.06         0.00-0.29  0.07

430-hectare area
70-100        A            100          0.39         0.00-0.73  0.19          100          0.43         0.14-0.74  0.14
              L            100          [illegible]  0.14-0.80  0.20          100          0.53         0.15-0.85  0.20
50-69         A            100          [illegible]  0.00-0.43  [illegible]   100          [illegible]  0.00-0.47  [illegible]
              L            100          0.26         0.07-0.53  [illegible]   100          0.29         0.05-0.55  0.14
40-49         A            94           0.08         0.00-0.38  0.10          97           0.08         0.00-0.34  0.09
              L            100          0.03         0.00-0.18  0.05          100          0.03         0.00-0.15  0.03
20-39         A            100          0.21         0.03-0.45  [illegible]   100          0.16         0.03-0.32  0.07
              L            94           0.04         0.00-0.18  0.06          97           0.02         0.00-0.16  0.04
0-19          A            100          0.19         0.05-0.36  0.10          100          [illegible]  0.03-0.26  0.07
              L            100          [illegible]  0.02-0.34  0.08          100          0.09         0.00-0.23  0.06


a n = 49.
b Aerial photography = A; Landsat = L.
c Eighteen analysis areas had productivity scores less than or equal to two.
d Thirty-one analysis areas had scores greater than two.
e The number of times a class type was present in an analysis area is expressed as the percentage (%) of the total number of areas.


Figure 61.5. Median proportion in each owl analysis area for nesting and good foraging habitat (50-100 percent canopy
cover, classes 1 and 2 combined) and for foraging and unsuitable habitat (50 percent canopy cover or less, classes 3, 4,
and 5 combined) by productivity score groups. Productivity scores greater than 2 indicate breeding success while scores
less than or equal to 2 indicate nonbreeding situations.
[Panels: Aerial Photo Image Source; Landsat Image Source. Horizontal axis: Area (ha). Legend: summary scores by canopy-cover group.]


The generalized additive model for the number of
fledglings at a site, which used canopy-cover estimates
from photography, indicated that the canopy
cover of the habitat containing the center of the
analysis area (P = 0.026), year (P < 0.001), and the 
proportion of the area in the 50-69-percent (P = 
0.026) and 40-49 percent (P = 0.010) canopy-cover 
classes were significant. In contrast, results of the 
generalized additive model using Landsat indicated 
that year (P < 0.001) was the only significant factor 
explaining the number of fledglings produced at a 
site. The proportion of the area in the 50-69 percent 
cover class was not significant (P = 0.692), but the 
proportion of the area in the 40-49 percent class (P 
= 0.081) and the habitat containing the center of the 
analysis area (P = 0.053) would be significant at an 
alpha level of 10 percent. 


Discussion and Conclusions 


The main focus of this research was to evaluate the in- 
fluence of different vegetation data sets on the rela- 
tions between canopy cover and the occurrence and 
productivity of the owls. Our research builds on the 
work of Verner et al. (1992), who evaluated forest 
characteristics for roost, nest, and home-range loca- 
tions with respect to all available habitats in the Sierra 
Nevada. We analyzed relations between forest 
canopies and the occupancy and productivity of owl 
activity centers over the past nine years. Results tested 
the following three hypotheses and rejected them all: 


* No difference in the proportion of canopy-cover 
classes in the study area as a whole and sites used 
by spotted owls. 

* No difference in the composition of canopy-cover 
classes among analysis areas exhibiting different 
levels of owl productivity. 

* No difference in conclusions about relations be- 
tween owl occurrence and productivity and canopy- 
cover classes based on aerial photography or Land- 
sat Thematic Mapper imagery. 


Owls selected use areas at three scales: (1) activity 
centers as represented by our analysis areas within the 
study area, (2) core areas within activity centers, and 
(3) roosting or nesting habitat. Results show that the
California spotted owl selects for specific forest
structure, especially near the central location of its
activity center (72-hectare and 168-hectare areas) and,
furthermore, that the composition of the area relates
to the occupancy of the site by owls through time. 
Verner et al. (1992) found the mean size of nest stands 
to be 40 hectares in the Sierra Nevada. Our results 
corroborate the findings of Verner et al. (1992): that 
California spotted owls are habitat specialists and, for 
nesting, they select stands with relatively closed 
canopies (greater than 70 percent). They also found 
that owls foraged significantly more than expected in 
stands with greater than or equal to 70 percent 
canopy cover and significantly less than expected in 
stands with 0-39 percent canopy closure. Laymon's
(1988) and Call’s (1990) studies suggest that spotted 
owls in the Sierra Nevada tended to forage in stands 
of intermediate to older ages. 

Hunter et al. (1995) and Meyer et al. (1998) found
that differences in landscape characteristics between
random sites and sites used by northern spotted owls in
the Klamath province were most significant when circles
with radii of 0.8 kilometer were used as analysis areas,
compared with larger circles centered on the same
locations. Franklin et al. (2000) used
a 0.71-kilometer-radius circle (one-half the median 
nearest-neighbor distance among thirty-seven territory 
centers) to represent spotted owl territories—similar 
to our 168-hectare analysis area with a radius of 0.73 
kilometer. Meyer et al. (1998) suggested that charac- 
teristics of these core areas may be most influential in 
determining territory locations for northern spotted 
owls. 
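
The circle sizes compared above can be cross-checked with simple geometry. The following is a minimal sketch, not part of the original analysis; the function names are ours and purely illustrative. It converts between the area of a circular analysis area (hectares) and its radius (kilometers).

```python
import math

def circle_radius_km(area_ha: float) -> float:
    """Radius in kilometers of a circle with the given area in hectares."""
    area_km2 = area_ha / 100.0  # 1 square kilometer = 100 hectares
    return math.sqrt(area_km2 / math.pi)

def circle_area_ha(radius_km: float) -> float:
    """Area in hectares of a circle with the given radius in kilometers."""
    return math.pi * radius_km ** 2 * 100.0

if __name__ == "__main__":
    for area in (72, 168):
        print(f"{area} ha -> radius {circle_radius_km(area):.2f} km")
    for radius in (0.71, 0.8):
        print(f"radius {radius} km -> {circle_area_ha(radius):.0f} ha")
```

Under this calculation the 168-hectare analysis area has a radius of about 0.73 kilometer, and a 0.71-kilometer circle covers roughly 158 hectares, so the two territory representations are of similar size.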

We conclude that canopy cover relates to owl
occurrence and productivity within our study area. 
Productivity scores were significantly correlated 
with canopy closure, and canopy-cover classes were 
significant in the multivariate models. For sites that 
consistently produced young, the median proportion 
of good habitat (canopy cover greater than 50 per- 
cent) was usually about 10 percent higher than for un- 
productive sites (based on photography data in Fig. 
61.5). The values ranged from 75 percent of the 72- 
hectare analysis area to 60 percent of the largest 
analysis area. When canopy cover was based on Land- 
sat data, the difference between productive and un- 
productive sites remained, but the magnitude de- 
creased to a 5-7 percent difference. Landsat data 
suggest that a higher proportion (80 to 90 percent) of
the owl analysis area needs to be in good habitat. Recall,
however, that Landsat depicted 23 percent more
of the landscape as having 60-100 percent canopy 
cover. Our results from the photography data agree 
with Bart and Forsman (1992), who reported that 
areas with less than 40 percent suitable owl habitat 
supported lower densities of spotted owls, and pairs 
had lower reproduction than in areas with greater 
than 60 percent suitable owl habitat. 

Although correlations between reproduction and 
canopy cover in our data were not large, they were 
consistent. Classes 1 and 2 were positively correlated 
and classes 3, 4, and 5 were negatively correlated with 
our productivity index for all three sizes of analysis 
area. From these correlations, one would conclude 
that the threshold between canopy cover values that 
contribute to or detract from occurrence and produc- 
tivity is a value near 50 percent. Indeed, the occur- 
rence of owl pairs at sites declined as the proportion 
of habitat with 0-39 percent canopy cover increased
in the area surrounding the activity center. A difference
of 10 percent in cover (say between 40 and 50
percent, or between 50 and 60 percent) is small and
can easily be of the same magnitude as the uncertainty
in canopy cover estimated from remotely sensed data and
in field measurements made by different people or with
different measurement devices.

Results from our multivariate models are logical 
given the spotted owl's strong association with late 
seral-stage forests for nesting and roosting and with 
early seral stages for prey sources. Zabel et al. (1995) 
found spotted owls foraging near edges of late and 
early seral-stage forests, and Ward et al. (1998) re- 
ported that woodrat abundance at sites where owls 
foraged was greatest at the ecotone between late and 
early seral stages. Franklin et al. (2000) found repro- 
ductive output to depend on a substantial amount of 
edge habitat, a low amount of core area, and a certain
amount of habitat fragmentation. Their models indicated
that changes in reproductive output were
most sensitive to changes in edge, while our results 
indicate that changes in canopy-cover composition of 
less than 10 percent can significantly affect occu- 
pancy. 

Franklin et al. (2000) stress the importance of climate
for temporal variation in life-history traits of the
northern spotted owl. Unseasonably cold and wet
events in late spring are thought to be a major factor
in annual productivity. The year variable was always
significant in our models, and we assume that it is a
surrogate for such events. A more refined analysis
could use the actual number and magnitude of such 
climate events. 

The results of compositional analyses and correlations
between owl habitat and productivity were similar
for the two data types. However, aerial photography and
Landsat estimates of the percentage of the landscape in
each of the five canopy classes (Fig. 61.5) differed;
moreover, conclusions about occurrence and what
constitutes productive owl habitat would be different
(Figs. 61.3 and 61.5). Landsat data suggest that a
greater proportion of the landscape around an activity 
center needs to have canopy cover values above 50 
percent. 

Conclusions about the importance of the canopy- 
cover class in the area immediately surrounding a nest 
site or primary roost site differed between the Landsat 
and photography data. The habitat at the activity cen- 
ter was a significant factor when examining occur- 
rence with the Landsat data set but not with the pho- 
tography data. The opposite was found, however, 
when the generalized additive model was used to test 
relations between canopy cover and the number of 
fledglings produced. The models developed from the 
Landsat data concerning productivity typically con- 
tained fewer variables relating to canopy cover. At this 
stage of analysis, multivariate models are being used 
to explore relations between habitats and owl occur- 
rence and productivity, and not for predictive pur- 
poses. The photography data seem to provide more in- 
sight into the relations that we examined, but formal 
accuracy assessments are needed on the vegetation 
data sets before we can conclude which data source is 
more accurate and whether differences between results 
are real. 

Our results provide further insights into the types 
of data required for long-term maintenance of spotted 
owls and the potential bias of using different sources 
of data for management decisions. Results such as 
ours will be used to set standards and guidelines for 
vegetation management within the distribution of the 
California spotted owl. Empirical relations between
spatial habitat data and spotted owl occurrence and
productivity could be used to model suitable habitat
or predict responses of the owls to forest management. 
Further research needs to address additional forest 
structural attributes, landscape patterns (especially 
edge), and the uncertainty in the spatial habitat data 
prior to such model development.


Using a Spatially Explicit Model to
Analyze Effects of Habitat Management 
on Northern Spotted Owls 


Martin G. Raphael and Richard S. Holthausen 


Conservation of the northern spotted owl (Strix
occidentalis caurina) has been a central issue in 
struggles over forest management in the Pacific 
Northwest (Thomas et al. 1990; USDI 1992; FEMAT 
1993; Hunsaker et al., Chapter 61). A recent demo- 
graphic analysis (Forsman et al. 1996a) indicated that 
the population of the northern spotted owl has de- 
clined over large portions of its range, but this report 
does not relate rates of population change to varia- 
tion in habitat quality over the species’ range. Be- 
cause of the finding that the owl population has de- 
clined over the past ten years, likely in response to 
habitat loss (Murphy and Noon 1992), the effect of 
further harvest of habitat is of great interest. A major
step in managing federal lands to conserve the owl 
was the development of a series of options for ecosys- 
tem management on federal forest lands by the Forest 
Ecosystem Management Assessment Team (FEMAT 
1993). Adoption of one of those options as a strategy 
for managing late-successional and old-growth forests 
on federal forestlands in the range of the owl has be- 
come known as the Northwest Forest Plan. Over 
the long term, the habitat reserve design that is part 
of this plan will likely support stable and well- 
distributed populations of owls when unsuitable 
habitat within the reserves has matured (Thomas et 
al. 1990; USDI 1992; Murphy and Noon 1992; 
Thomas et al. 1993; Raphael et al. 1994, 1998). 


However, the Northwest Forest Plan is focused on 
management of spotted owl habitat on federal lands. It 
is of interest to consider the degree to which additional 
contributions of habitat on nonfederal lands might fur- 
ther support spotted owl populations. The analysis 
presented in this chapter was originally requested by 
the U.S. Fish and Wildlife Service to provide back- 
ground for development of a section 4(d) rule that 
might authorize incidental take of some northern spot- 
ted owls on nonfederal forests on the Olympic Penin- 
sula, Washington (see Fig. 62.1 in color section). This 
analysis describes likely patterns of distribution and 
persistence of owls on the Olympic Peninsula under the 
provisions of the Northwest Forest Plan, and benefits 
to the owl population of varying levels of habitat con- 
tribution from nonfederal lands. This chapter also ex- 
tends earlier results reported elsewhere (Holthausen et 
al. 1995). Holthausen et al. (1995) reported on simu- 
lated owl population responses to a series of hypothet- 
ical scenarios that might be considered in the develop- 
ment of a possible 4(d) rule for spotted owl 
management on nonfederal lands. One of the actions 
that had been considered by the U.S. Fish and Wildlife
Service (FWS) was adoption of a special rule under sec- 
tion 4(d) of the Endangered Species Act that addressed 
the conservation of northern spotted owls while reduc- 
ing the prohibition against incidental take of owls in 
the course of timber harvest and related activities on 
nonfederal lands. A Notice of Intent was published on
29 December 1993 that identified a series of Special
Emphasis Areas where the FWS believed that mainte- 
nance of nonfederal habitat was necessary. The FWS 
was further evaluating those areas and additional alter- 
natives to develop a proposed 4(d) rule. The question 
posed for this analysis was "Can the contribution of
nonfederal lands be made more efficiently than 
through the current take guidelines?" The analysis fo- 
cused on alternative scenarios for retention of nonfed- 
eral habitat throughout the Olympic Peninsula and 
within the western Special Emphasis Area (SEA). Here, 
a more efficient scenario was defined as one that re- 
quired less contribution of nonfederal habitat for simi- 
lar benefits to owl conservation compared with current 
take guidelines. 


Methods 


We based our analysis on the same digital map of 
nesting, roosting, and foraging (NRF) habitat used by 
Holthausen et al. (1995). This map was assembled by 
the Washington Department of Natural Resources 
(WDNR 1997) and is an aggregation of the best avail- 
able information from a variety of different sources, 
including the FEMAT (1993) habitat classification for 
federal lands and data from the Washington Depart- 
ment of Fish and Wildlife for lands administered by 
the WDNR (Fig. 62.2). On remaining state lands, the 
WDNR used inventoried stands (land use/land cover 
database) that met NRF habitat definitions. For all 
other lands, the WDNR used a satellite-derived forest-cover
classification (Green et al. 1993) and selected as
habitat either the late-successional category (generally
stands with greater than 10 percent cover of trees
greater than 53 centimeters [21 inches] in diameter at
breast height [dbh]) or a combination of the late-
successional and mid-successional (less than 10 per-
cent cover of 53-centimeter-dbh trees) categories,
whichever had the best fit to classified owl habitat
within a particular planning unit. We assumed static
habitat conditions; that is, we did not model growth
in habitat over time, nor did we model the effects of
catastrophic fire. In addition, for each of the new
land-management scenarios, we assumed, based on
advice provided by the FWS, that no habitat would be
retained on tribal lands.


Figure 62.2. Current nesting, roosting, and foraging habitat of the northern spotted owl (Strix occidentalis caurina) on the Olympic
Peninsula, Washington. Figure (A) depicts habitat under Scenario 2, retention of habitat on all lands; (B) depicts habitat remaining
under Scenario 3, application of take guidelines; (C) depicts habitat remaining under Scenario 1, removal of all habitat on nonfed-
eral lands; and (D) depicts Scenario 4, retention of habitat within the western Special Emphasis Area (SEA) (see text).


Spotted Owl Population Simulation 


To evaluate the relative likelihood of persistence of the 
northern spotted owl under alternative land allocation 
scenarios, we used a spatially explicit life-history simulator
(OWL, version 2.01; McKelvey et al. 1992). This
model is a single-organism simulator that is based 
largely on models developed by Lande (1987, 1988a)
and Lamberson (Thomas et al. 1990; Lamberson et al. 
1992, 1994) and is similar to Pulliam's BACHMAP 
model (Pulliam et al. 1992). The model is sensitive to 
the shape and location of high-quality habitat, which 
we mapped as described above. The model should be 
viewed as a tool for landscape design that allows a log- 
ical framework in which to assess qualitative differences 
in various land management plans in regard to popula- 
tion dynamics of the northern spotted owl. 

We parameterized the model as described in 
Holthausen et al. (1995). The primary relationships 
built into the simulation model were variation in fe- 
cundity and adult survivorship in relation to amount 
of NRF habitat. These relationships were derived 
from regressions reported by Bart (1995b): 


f = 0.32 + 0.54p
s_a = 0.63 + 0.39p


where f is fecundity, s_a is adult survival, and p is the
proportion of the area surrounding an owl activity
center that is composed of NRF habitat. We used
these regressions to establish rule sets relating percentage
of suitable habitat in owl territories to demographic
parameters (see Fig. 62.3 in color section).
These relationships were adjusted to match demo-
graphic data reported by Forsman et al. (1996b) from
their study on the Olympic Peninsula. For this adjust-
ment, we retained the slopes of the regression reported
above but adjusted the intercepts to match the
Olympic results. In addition, to avoid implausibly
high rates, we truncated the survival and fecundity pa-
rameters in the highest-quality cells (Table 62.1).


TABLE 62.1


Adult survival and fecundity in relation to percent suitable
habitat within hexagonal cells (spotted owl sites).


                      Percent suitable habitat
Parameter           0-20    >20-30    >30-40    >40-60     >60

Adult survival      0.76      0.82      0.86      0.92    0.92
Fecundity           0.24      0.33      0.38      0.46    0.46
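
To make the two steps above concrete, the following minimal sketch encodes the unadjusted regressions quoted in the text and a lookup of the adjusted, truncated values in Table 62.1. It is an illustration only: the function and constant names are ours, the OWL simulator's actual input format is not shown here, and treating the class boundaries as inclusive upper bounds (for example, exactly 20 percent falls in the 0-20 class) is our reading of the table headings.

```python
import bisect

def bart_regression(p):
    """Unadjusted (fecundity, adult survival) for habitat proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    fecundity = 0.32 + 0.54 * p
    adult_survival = 0.63 + 0.39 * p
    return fecundity, adult_survival

# Table 62.1: upper bounds (percent) of the first four habitat classes;
# the fifth class is everything above 60 percent.
CLASS_UPPER_BOUNDS = [20, 30, 40, 60]
ADULT_SURVIVAL = [0.76, 0.82, 0.86, 0.92, 0.92]
FECUNDITY = [0.24, 0.33, 0.38, 0.46, 0.46]

def table_62_1_lookup(percent_habitat):
    """Return (fecundity, adult survival) from Table 62.1 for a cell."""
    if not 0 <= percent_habitat <= 100:
        raise ValueError("percent_habitat must be between 0 and 100")
    # bisect_left places a value equal to a bound in the lower class.
    i = bisect.bisect_left(CLASS_UPPER_BOUNDS, percent_habitat)
    return FECUNDITY[i], ADULT_SURVIVAL[i]

if __name__ == "__main__":
    print(bart_regression(1.0))   # unadjusted adult survival exceeds 1 at p = 1
    print(table_62_1_lookup(35))  # a cell with 35 percent suitable habitat
```

Evaluating the unadjusted regression at p = 1 gives an adult survival above 1, which illustrates why the intercepts were adjusted and the highest classes truncated.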

These parameters (and others listed by Holthausen
et al. 1995) were used as input to the simulation
model and were labeled “rule set B.” In this rule set, 
we used an estimate of juvenile survival of 0.29, the 
median from the values in Burnham et al. (1996) for 
eleven study areas throughout the owl’s range. Rule set 
A was identical to rule set B except that the parameters 
were shifted to the right (into the next-highest habitat 
category); the parameters of rule set C were shifted to 
the left (into the lower habitat category). A fourth rule 
set, rule set D, was identical to rule set B except that ju- 
venile survival was increased to 0.38, the estimate from 
Burnham et al. (1996) that was adjusted for emigration 
of juveniles out of study areas. 
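
One way to read the four rule sets is sketched below. The values for rule set B come from Table 62.1 and the juvenile survival estimates from the text; everything else is an assumption made for illustration. In particular, the text does not say what value the lowest habitat class receives when parameters are shifted right, or the highest class when they are shifted left, so this sketch simply repeats the end value. The names `RULE_B`, `shift`, and `make_rule_sets` are hypothetical.

```python
RULE_B = {
    "adult_survival": [0.76, 0.82, 0.86, 0.92, 0.92],
    "fecundity": [0.24, 0.33, 0.38, 0.46, 0.46],
    "juvenile_survival": 0.29,
}

def shift(values, steps):
    """Shift class values right (steps > 0) or left (steps < 0).

    Assumption: the class left without a donor repeats the nearest end value.
    """
    last = len(values) - 1
    return [values[min(max(i - steps, 0), last)] for i in range(len(values))]

def make_rule_sets(rule_b=RULE_B):
    """Build rule sets A, B, C, and D from rule set B."""
    def shifted(steps):
        return {
            "adult_survival": shift(rule_b["adult_survival"], steps),
            "fecundity": shift(rule_b["fecundity"], steps),
            "juvenile_survival": rule_b["juvenile_survival"],
        }
    rule_d = dict(rule_b, juvenile_survival=0.38)
    return {"A": shifted(1), "B": dict(rule_b), "C": shifted(-1), "D": rule_d}

if __name__ == "__main__":
    for name, rules in make_rule_sets().items():
        print(name, rules)
```

Under this reading, rule set A demands more habitat for the same demographic rates (the most pessimistic case) and rule set C demands less (the most optimistic case).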

As in these previous analyses, we defined hexagonal 
cells of 1,500 hectares and initialized the model with 
pairs of owls wherever habitat exceeded 30 percent of 
a hexagonal cell. The cell size of 1,500 hectares was 
selected based upon field data on the observed density 
of owls and to achieve a carrying capacity in line with 
estimated population size on the Peninsula (see 
Holthausen et al. 1995 for details). To describe rela- 
tive variability of simulation runs (each run consisting 
of fifty separate simulations), we replicated an entire 
sequence of runs ten times and tabulated variation in 
simulation results for rule sets B and D. Variation is
due to stochastic elements of the simulation model itself.
To gain some insight into validity of the model
projections, we compared projected occupancy within 
each cell to observed locations of spotted owls as iden- 
tified in agency databases. We performed this compar- 
ison with occupancy from rule sets B and D. 
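
The initialization rule and the summary of run-to-run variability described above can be sketched as follows. This is a hedged illustration rather than the simulator's own code: the cell identifiers, the dictionary layout, and both function names are assumptions.

```python
def initial_pairs(habitat_fraction_by_cell, threshold=0.30):
    """Cells seeded with an owl pair: habitat must exceed the threshold fraction."""
    return {
        cell
        for cell, fraction in habitat_fraction_by_cell.items()
        if fraction > threshold
    }

def replicate_range(values):
    """Extreme values across replicate runs and the width of that range."""
    low, high = min(values), max(values)
    return low, high, high - low

if __name__ == "__main__":
    # Hypothetical habitat fractions for three 1,500-hectare hexagonal cells.
    cells = {"hex_001": 0.12, "hex_002": 0.30, "hex_003": 0.55}
    print(initial_pairs(cells))
    # Hypothetical mean numbers of pairs from replicate runs.
    print(replicate_range([175, 173, 177, 174]))
```

Because the rule is "exceeded 30 percent," a cell with exactly 30 percent habitat is not seeded in this sketch.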


Land Management Scenarios 


Most scenarios were reproduced from Holthausen et 
al. (1995) and represented the likely retention of non- 
federal NRF habitat within various alternatives (Fig. 
62.2). Scenario 1 represented the retention of only fed- 
eral habitat. Scenario 2 represented the retention of all 
currently existing nonfederal and federal NRF habitat. 
Scenario 3 represented current take guidelines where 
nonfederal habitat was retained only within owl cir- 
cles of 40 percent habitat or less. This scenario serves 
as an appropriate benchmark against which to com- 
pare alternative Peninsula-wide approaches. Scenario 
4W represented retention of nonfederal habitat fol- 
lowing take guidelines within the western SEA. 
Scenario 5 was also taken from Holthausen et al. 
(1995), where it was presented as one approach to- 
ward a more efficient mechanism for retention of non- 
federal habitat. Efficiency was evaluated as the ratio of 
the gain in owl occupancy or population size to the 
gain in amount of nonfederal habitat retained under a 
scenario relative to the gain expected between Scenar- 
ios 1 and 2. As seen in Holthausen et al. (1995), Sce- 
narios 1, 2, and 3 fall along an essentially straight line, 
where more habitat translated to greater numbers of 
owls. Any scenario falling above that line can be con- 
sidered more efficient; such a scenario would indicate 
a relatively greater contribution to the population of 
owls per unit of nonfederal habitat. As noted by 
Holthausen et al. (1995), Scenario 5 was developed by 
identifying habitat that occurred in hexagonal cells 
with successively greater predicted occupancy by pairs 
of owls as evaluated at the conclusion of 100-year 
simulation runs. Any nonfederal habitat in a cell sup- 
porting less than or equal to 10 percent mean occu- 
pancy by pairs of owls was removed, then nonfederal 
habitat in a cell with less than or equal to 20 percent 
occupancy, less than or equal to 40 percent occupancy, 
and less than or equal to 60 percent occupancy was 
successively removed. After each removal, a map of
remaining habitat was created, and a new simulation
run was completed.
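
The efficiency comparison described above can be expressed as a ratio of gains per hectare. The sketch below is one plausible formalization under our own assumptions, not a formula given in the chapter: a scenario's gain in pairs over Scenario 1, per hectare of nonfederal habitat retained, is divided by the corresponding gain per hectare between Scenarios 1 and 2. Values above 1 correspond to points above the Scenario 1-2 line. The pair counts in the example are hypothetical; only the habitat areas come from the text.

```python
def relative_efficiency(pairs, habitat_ha, pairs_s1, pairs_s2, habitat_s2_ha):
    """Gain in pairs per hectare, relative to the Scenario 1 to Scenario 2 line."""
    if habitat_ha <= 0 or habitat_s2_ha <= 0:
        raise ValueError("habitat areas must be positive")
    baseline_per_ha = (pairs_s2 - pairs_s1) / habitat_s2_ha
    if baseline_per_ha == 0:
        raise ValueError("Scenarios 1 and 2 must differ in number of pairs")
    gain_per_ha = (pairs - pairs_s1) / habitat_ha
    return gain_per_ha / baseline_per_ha

if __name__ == "__main__":
    # Hypothetical pair counts; 29,700 and 64,600 hectares are taken from the text.
    ratio = relative_efficiency(
        pairs=170, habitat_ha=29_700,
        pairs_s1=100, pairs_s2=175, habitat_s2_ha=64_600,
    )
    print(f"relative efficiency: {ratio:.2f}")
```

A scenario lying exactly on the line (for example, Scenario 2 itself) returns 1 under this definition.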

Scenario 6 is a new set based on removal of habi- 
tat without regard to occupancy by owls. For this 
set, we first eliminated any habitat that occurred on 
tribal land. We then eliminated nonfederal habitat in 
any hexagonal cell where habitat was less than or 
equal to 20 percent of that cell and ran the owl pop- 
ulation simulation on the resulting habitat map. 
Next, we identified cells with less than or equal to 40 
percent habitat, removed all nonfederal habitat from 
those cells, and ran another simulation. Similarly, we 
successively removed nonfederal habitat at the 60 
and 80 percent levels, running simulations after each 
step. 
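
The successive removal steps for Scenario 6 can be sketched as below. This is an illustrative reconstruction under stated assumptions: the `Cell` structure, its field names, and the generator are ours; the prior removal of tribal-land habitat is assumed to have been applied already; and the point at which the population simulation would be run is marked only by a comment.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Cell:
    """A hexagonal cell with its federal and nonfederal habitat (hectares)."""
    cell_id: str
    federal_ha: float
    nonfederal_ha: float
    area_ha: float = 1500.0

    @property
    def percent_habitat(self):
        return 100.0 * (self.federal_ha + self.nonfederal_ha) / self.area_ha

def remove_nonfederal(cells, threshold_pct):
    """Drop nonfederal habitat from cells at or below the habitat threshold."""
    return [
        replace(cell, nonfederal_ha=0.0)
        if cell.percent_habitat <= threshold_pct else cell
        for cell in cells
    ]

def successive_removal(cells, thresholds=(20, 40, 60, 80)):
    """Yield (threshold, habitat map) after each successive removal step."""
    current = list(cells)
    for threshold in thresholds:
        current = remove_nonfederal(current, threshold)
        # A population simulation would be run on `current` at this point.
        yield threshold, current

if __name__ == "__main__":
    # Hypothetical cells, for illustration only.
    cells = [Cell("a", 100, 150), Cell("b", 300, 450), Cell("c", 900, 500)]
    for threshold, habitat_map in successive_removal(cells):
        total = sum(cell.nonfederal_ha for cell in habitat_map)
        print(f"<= {threshold}%: {total:.0f} ha nonfederal habitat retained")
```

The same loop could serve for Scenario 5 by testing each cell's simulated mean occupancy instead of its percent habitat.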


Results 


The amount of nonfederal habitat that might be re- 
tained under each of the scenarios varied from 0 
hectares under Scenario 1 to 64,600 hectares under 
Scenario 2 over the entire Peninsula; 32,000 hectares 
were retained under Scenario 3 (Table 62.2). Removal 
of habitat from tribal lands reduced total nonfederal 
habitat by 6,800 hectares. 

Our model validation test revealed general agree- 
ment between occupancy from the simulation model 
and occupancy by observed birds. The owl location 
database contains 169 sites occupied by spotted owls. 
Under rule set B, 168 of the 169 hexagons containing 
known locations had predicted occupancy greater 
than 0 (errors of omission less than 1 percent).
However, a larger number of hexagons were predicted 
to be occupied at some level but were not actually oc- 
cupied (errors of commission = 56 percent). Under 
rule set D, all 169 sites were predicted to be occupied 
at some level (errors of omission = 0 percent); 73 per- 
cent of sites with predicted occupancy were not actu- 
ally occupied (errors of commission). For both rule 
sets, the errors of commission were most prevalent at 
low levels of occupancy (i.e., over half of the errors in- 
volved cells where expected occupancy was less than 
20 percent). Observed rates of occupancy increased 
with increasing predicted rates of occupancy under 
both rule sets (Fig. 62.4). We believe that errors of
omission are of greater concern than errors of
commission: if the model predicted absence at a site
and we actually observed a pair of owls at that site,
we would be concerned that our model was missing
important habitat attributes. That the model predicted
low rates of occupancy in sites where no owls were
present in the current database may simply mean we
did not observe that site over enough years to record
an owl there. The simulation model ran for one
hundred years whereas our owl database represents
only five years of observation.


TABLE 62.2.


Total area of suitable northern spotted owl (Strix occidentalis caurina) habitat
(hectares) under each management scenario, Olympic Peninsula.(a)


Scenario                                            Total habitat   Nonfederal habitat

1. Federal habitat only(b)                                265,900                    0
2. Federal and nonfederal                                 330,600               64,600
3. Current take guidelines                                297,900               32,000
4. Realized Special Emphasis Area: West                   287,100               21,100
5. Selective removal(c)
   Rule A
      ≤10%                                                278,900               13,000
      ≤20%                                                272,000                6,100
      ≤40%                                                266,600                  700
      ≤60%                                                266,000                    0
   Rule B
      ≤10%                                                295,700               29,700
      ≤20%                                                288,600               22,600
      ≤40%                                                280,600               14,600
      ≤60%                                                271,500                5,500
   Rule C
      ≤10%                                                310,400               44,400
      ≤20%                                                300,300               34,300
      ≤40%                                                291,500               25,600
      ≤60%                                                285,000               19,100
   Rule D
      ≤10%                                                306,200               40,300
      ≤20%                                                298,900               32,900
      ≤40%                                                290,100               24,100
      ≤60%                                                283,800               17,800
6. Federal and nonfederal, not including tribal(d)        323,800               57,900
      ≤20%                                                300,800               34,800
      ≤40%                                                277,300               11,300
      ≤60%                                                269,600                3,700
      ≤80%                                                267,800                1,800


(a) All values rounded to the nearest 100 hectares.
(b) Includes national forest and national park land only (does not include minor contributions of
other federal land).
(c) Selective removal of nonfederal habitat in cells based on successively higher mean
occupancy (Scenario 5, see text for details).
(d) Selective removal of nonfederal habitat based on successively higher percentage of habitat in
cells (Scenario 6, see text for details).


TABLE 62.3.


Range(a) of results associated with simulating northern spotted owl population dynamics
using rule sets B and D under Scenario 2, federal and nonfederal habitat, Olympic
Peninsula.


Criterion                                     Rule B                    Rule D

Number of cells >60-80% mean occupancy        120-125 (5)               109-121 (12)
Number of cells >80% mean occupancy           0                         138-144 (6)
Total of cells >60% mean occupancy            120-125 (5)               248-260 (12)
Mean number of pairs                          173-177 (4)               282-287 (5)
Pairs at year 60                              149-153 (4)               277-283 (6)
Pairs at year 100                             127-137 (10)              274-282 (8)
λ(b) evaluated for years 60-100               0.9960-0.9973 (0.0013)    0.9992-1.0003 (0.0011)


(a) Ranges are the extreme values generated from ten separate simulation runs for each rule set
(each run consisted of fifty separate simulations). Simulations assumed that current vegetation
patterns were static (no regrowth), and all habitat currently on the Olympic Peninsula was retained.
Values in parentheses are ranges.
(b) Finite rate of population change.


[Figure 62.4: panels for Rule Set B and Rule Set D; x-axis, Predicted Mean Occupancy (%) in classes 0, >0-20, >20-40, >40-60, >60-80, and >80; y-axis, Observed Mean Occupancy (%).]

Figure 62.4. Observed occupancy (known northern spotted
owls [Strix occidentalis caurina]) in relation to predicted occu-
pancy under two rule sets on the Olympic Peninsula, Washing-
ton. Predicted occupancy is the percentage of years in which a
hexagonal cell was occupied by a pair of owls during a set of
simulation runs (see text for details).
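
The errors of omission and commission reported in the model validation can be computed as sketched below. The definitions follow the wording of the text (omission as the share of known owl sites with zero predicted occupancy; commission as the share of cells with nonzero predicted occupancy that hold no known owls), but the function name and the data layout are our assumptions, and the example cells are hypothetical.

```python
def validation_errors(predicted_occupancy, observed_cells):
    """Return (omission rate, commission rate) as proportions.

    predicted_occupancy: mapping of cell id -> predicted mean occupancy (%)
    observed_cells: iterable of cell ids containing known owl locations
    """
    predicted = {cell for cell, occ in predicted_occupancy.items() if occ > 0}
    observed = set(observed_cells)
    if not observed or not predicted:
        raise ValueError("need at least one observed and one predicted cell")
    omission = len(observed - predicted) / len(observed)
    commission = len(predicted - observed) / len(predicted)
    return omission, commission

if __name__ == "__main__":
    # Hypothetical example: four cells, two with known owls.
    predicted = {"a": 10.0, "b": 0.0, "c": 5.0, "d": 1.0}
    print(validation_errors(predicted, ["a", "b"]))
    # One missed hexagon out of 169 known sites, as a percentage.
    print(f"{100 * 1 / 169:.2f} percent")
```

One missed hexagon out of 169 corresponds to an omission rate of roughly 0.6 percent.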

The results of population simulations based on each 
of the scenarios show that some scenarios do result in 
more efficient habitat retention. All comparisons should
be made bearing in mind the expected levels of variabil- 
ity among runs (Table 62.3). As summarized in Figure 
62.5, Scenario 5 and its subparts seem to result in 
greater mean numbers of owls per unit of nonfederal 
habitat. For example, under rule set B, Scenario 5A (re- 
moval based on cells with less than or equal to 10 per- 
cent occupancy) resulted in about the same mean num- 
ber of pairs as Scenario 2 (all nonfederal habitat, 
64,600 hectares), while retaining only half the nonfed- 
eral habitat (29,700 hectares, Table 62.2, Fig. 62.5). 
This scenario also supported more owls per unit non- 
federal habitat than Scenario 3 (current take guidelines) 
under all rule sets except rule set A (Fig. 62.5). 

Removal of habitat based on percent habitat
within each cell regardless of occupancy was not as
efficient as Scenario 5. As shown in Fig. 62.5, for
any given level of nonfederal habitat, mean number
of pairs was greater under Scenario 5 than under this
scenario. However, removal of habitat based on its
percentage within cells was more efficient than current
take guidelines (Table 62.4). For example, under
rule sets A and B, removal of habitat from cells with
less than or equal to 40 percent habitat (6C) resulted
in mean pairs of about the same level as Scenario 3
but with about one-third the total nonfederal habitat
(11,300 hectares [27,900 acres] versus 32,000
hectares; Table 62.2, Fig. 62.5). Rates of occupancy
were also greater under this approach (n = 112 cells
with more than 60 percent occupancy under the 40
percent removal scenario and using rule set B versus
108 cells under Scenario 3; Table 62.5).


[Figure 62.5: panel titles recovered: Rule Set B, Rule Set D, Rule Set C; x-axis, Suitable Non-federal Nesting Habitat (ha x 1000); y-axis, Projected Mean Number of Pairs.]

Figure 62.5. The relation between nonfederal habitat for each scenario and the expected average number of pairs, evaluated over
the entire Olympic Peninsula. Scenario 1 models federal habitat only, without regrowth. Scenario 2 includes all existing federal and
nonfederal habitat without regrowth or harvest. Scenario 3 retains habitat following the current take guidelines. Scenario 4 retains
nonfederal habitat that occurs only within a western Special Emphasis Area. Scenario 5 saves nonfederal habitat based on its sim-
ulated contribution to population size; 5A results from removal of nonfederal habitat from cells with mean occupancy of less than
or equal to 10 percent, 5B from cells of less than or equal to 20 percent, 5C from cells of less than or equal to 40 percent, and
5D from cells of less than or equal to 60 percent. Scenario 6 is the same as Scenario 2 except that we assumed no habitat on
tribal lands; 6A is the retention of all nonfederal habitat, 6B is removal of nonfederal habitat from any cell with less than or equal
to 20 percent habitat, 6C is removal of nonfederal habitat from any cell with less than or equal to 40 percent habitat, 6D and 6E
result from removal of habitat in cells with less than or equal to 60 percent or 80 percent habitat, respectively. Occupancy patterns
and mean numbers of pairs were based on simulations using rule sets A, B, C, and D. The diagonal line is a regression of mean
number of owl pairs against amount of nonfederal habitat for Scenarios 1, 2, and 3.

The location of habitat that would be retained 
under these approaches (Fig. 62.3) reflects the current 
distribution of higher-quality patches where habitat oc- 
curs at higher concentrations. For example, Scenario 
5C (selective removal when occupancy was less than or 
equal to 40 percent) under rule set B resulted in retention
of habitat within about 5 kilometers of federal
lands along the western boundary of the Olympic Na- 
tional Park, the southwestern boundary of the 
Olympic National Forest, and small patches along the 
northern boundary of the national forest. Many of the 
scenarios resulted in a similar distribution of retained 
habitat, although the amounts and exact distributions 
varied among the scenarios and rule sets. 


Discussion 


Discussion of this analysis is organized around the fol- 
lowing questions: 


1. Can the contribution of nonfederal lands to owl 
conservation be made more efficiently than 
through the current take guidelines? 


TABLE 62.4.


Summary of simulation results under Scenario 5, selective removal of low-occupancy habitat,
using rule sets A, B, C, and D, Olympic Peninsula.


                                                       Percent mean occupancy(a)

Criterion                                          ≤10%         ≤20%         ≤40%         ≤60%

Rule set A:
Amount of habitat (ha x 1,000)                      279          272          267          266
Number of cells >60-80% mean occupancy                0            0            0            0
Number of cells >80% mean occupancy                   0            0            0            0
Total of cells >60% mean occupancy                    0            0            0            0
Mean number of pairs                                 66           61           57           56
Pairs at year 60                                     36           32           27           25
Pairs at year 100                                    15           12            9            9
λ(b) evaluated for years 60-100                  0.9784       0.9755       0.9731       0.9745

Rule set B:
Amount of habitat (ha x 1,000)                      296          289          281          271
Number of cells >60-80% mean occupancy              124          122          124  [illegible]
Number of cells >80% mean occupancy                   0            0            0            0
Total of cells >60% mean occupancy                  121          122          124  [illegible]
Mean number of pairs                        [illegible]          172          169          152
Pairs at year 60                                    154          152          149          134
Pairs at year 100                                   139          131          128  [illegible]
λ(b) evaluated for years 60-100                  0.9975       0.9962       0.9964       0.9961

Rule set C:
Amount of habitat (ha x 1,000)                      310          300          292          285
Number of cells >60-80% mean occupancy              136          134          133          136
Number of cells >80% mean occupancy                 127          130          135          125
Total of cells >60% mean occupancy                  263          264          268          261
Mean number of pairs                                290          290          290          276
Pairs at year 60                                    289          287          287          273
Pairs at year 100                                   286          280          285          267
λ(b) evaluated for years 60-100                  0.9998       0.9994       0.9998       0.9995

Rule set D:
Amount of habitat (ha x 1,000)                      306          299          290          284
Number of cells >60-80% mean occupancy              116          119          110  [illegible]
Number of cells >80% mean occupancy                 141          134          146  [illegible]
Total of cells >60% mean occupancy                  257          253          256          254
Mean number of pairs                                285          280          280  [illegible]
Pairs at year 60                                    283          276          274          266
Pairs at year 100                                   277          278  [illegible]          262
λ(b) evaluated for years 60-100                  0.9995       1.0002       1.0003       0.9996


(a) All nonfederal habitat was removed from cells with an indicated mean occupancy percentage based on
earlier simulations of Scenario 2 (all habitat).
(b) Finite rate of population change.


TABLE 62.5.

Summary of simulation results under Scenario 6, selective removal of habitat from cells based on
percent habitat within each cell, using rule sets A, B, C, and D, Olympic Peninsula.

                                                             Percent habitat removed^a
Criterion                                           0%        < 20%        < 40%        < 60%        < 80%

Rule set A:
  Amount of habitat (ha x 1,000)                   324          301          277          270          268
  Number of cells > 60-80% mean occupancy            0            0            0            0            0
  Number of cells > 80% mean occupancy               0            0            0            0            0
  Total of cells > 60% mean occupancy                0            0            0            0            0
  Mean number of pairs                              66           67           63           60           58
  Pairs at year 60                                  33           34           32           30           28
  Pairs at year 100                                 12           14           14           12           11
  λ^b evaluated for years 60-100                0.9763       0.9787       0.9791       0.9781       0.9776

Rule set B:
  Amount of habitat (ha x 1,000)                   324          301          277          270          268
  Number of cells > 60-80% mean occupancy  [illegible]          109          112           94           86
  Number of cells > 80% mean occupancy               0            0            0            0            0
  Total of cells > 60% mean occupancy      [illegible]          109          112           94           86
  Mean number of pairs                             169          164          161          146          144
  Pairs at year 60                                 148          139          141          126          123
  Pairs at year 100                                119          119          123          105          100
  λ^b evaluated for years 60-100                0.9947       0.9963       0.9966       0.9955       0.9948

Rule set C:
  Amount of habitat (ha x 1,000)                   324          301          277          270          268
  Number of cells > 60-80% mean occupancy          140          144          151          151          169
  Number of cells > 80% mean occupancy             113          115           81           69           49
  Total of cells > 60% mean occupancy              253          259          232          220          218
  Mean number of pairs                             282          284          249          239          235
  Pairs at year 60                                 274          280          242          236          227
  Pairs at year 100                                275  [illegible]          241          226          215
  λ^b evaluated for years 60-100                1.0001       0.9991       0.9999       0.9990       0.9986

Rule set D:
  Amount of habitat (ha x 1,000)                   324          301          277          270          268
  Number of cells > 60-80% mean occupancy          110          115          107          108          104
  Number of cells > 80% mean occupancy             138          131          125          109          110
  Total of cells > 60% mean occupancy              248          246          232          217          214
  Mean number of pairs                             275          275          256          239          239
  Pairs at year 60                                 270          269          251          232          228
  Pairs at year 100                                268          267          250          232          226
  λ^b evaluated for years 60-100                0.9998       0.9998       0.9998       1.0000       0.9998

^a All nonfederal habitat was removed in cells with the indicated habitat percentage.
^b Finite rate of population change.
2. If there is a more efficient way to provide a nonfederal
contribution, what are the characteristics of
habitat that contribute to it?

3. To what degree do different levels of nonfederal 
contribution contribute to the persistence of the 
owl population on the Olympic Peninsula? 

4. How could the results of this analysis be used ap- 
propriately in policy decisions? 


Throughout this discussion, any conclusions are 
tempered by general cautions concerning the interpre- 
tation of model results (Holthausen et al. 1995). Be- 
cause the relationship between model runs and reality 
is not known, the relative differences observed among 
land management scenarios are of most interest. The 
different rule sets are used to show how those relative 
differences change based on a change in biological un- 
derstandings or assumptions. 


Can Nonfederal Lands Contribute to Owl
Conservation More Efficiently Than
through the Current Take Guidelines?


Scenario 5 was developed to investigate this question as 
part of the initial analysis done in Holthausen et al. 
(1995). The results for Scenario 5 suggested that more- 
efficient nonfederal contributions could be crafted. That 
scenario selectively removed habitat from the simulated 
landscape based on the occupancy of the habitat 
observed in model results. The results of the scenario 
were viewed with some caution because they used the 
results of one model run to improve the performance of 
the population in a second model run. The new analy- 
ses reported here were designed in part to further vali- 
date the conclusions drawn from Scenario 5. 

The current analyses confirm that configurations of 
habitat different from the take guidelines are more ef- 
ficient in contributing to owl persistence on the 
Olympic Peninsula. This effect can be observed in Fig- 
ure 62.5, which shows the mean number of owl pairs 
maintained through one hundred years in each of the 
scenarios and the hectares of nonfederal owl habitat 
that were retained in the scenarios. Scenario 3 repre- 
sents application of take guidelines throughout the 
Peninsula, Scenario 2 includes all existing federal and 
nonfederal habitat, and Scenario 1 includes only federal
habitat. The regression line shows the expected
relationship between nonfederal habitat and owl pairs
based on these three scenarios. Scenarios are judged 
more efficient than the regression line if they fall 
above it, indicating that they support a greater in- 
crease in number of owl pairs per hectare of habitat. 
To be conclusive, the difference in number of pairs 
supported must be greater than the differences ex- 
pected from the stochastic nature of the model (Table 
62.3). The specific scenarios shown to be more effi- 
cient differ depending on the rule set used. 
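
The comparison described above can be sketched numerically. The Python fragment below is a minimal illustration only, not the analysis used in the study: it fits a least-squares line through three baseline scenarios and reports how far a fourth scenario falls above or below that line, treating the gain as conclusive only if it exceeds an allowance for stochastic variation among model runs. All habitat and pair values, the function names, and the `noise` allowance are hypothetical placeholders; none are taken from Figure 62.5 or Table 62.3.

```python
def fit_line(points):
    """Ordinary least-squares line through (x, y) points; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def efficiency_gain(baseline, scenario, noise):
    """Compare one scenario with the line fitted through the baseline scenarios.

    Returns (residual, conclusive). A positive residual means the scenario
    supports more pairs than the line predicts for its habitat area;
    'conclusive' is True only when the residual exceeds the assumed
    run-to-run stochastic spread (noise).
    """
    slope, intercept = fit_line(baseline)
    habitat, pairs = scenario
    residual = pairs - (slope * habitat + intercept)
    return residual, residual > noise


# Hypothetical (nonfederal habitat in 1,000 ha, mean owl pairs) for three baselines
baseline = [(0.0, 100.0), (50.0, 150.0), (100.0, 200.0)]
residual, conclusive = efficiency_gain(baseline, (60.0, 185.0), noise=10.0)
```

With these placeholder numbers the candidate scenario lies above the fitted line by more than the assumed allowance, so it would be judged more efficient; a residual smaller than the allowance would be left undecided.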

Among the scenarios applied to nonfederal land 
across the entire peninsula (Fig. 62.5), the following 
are more efficient than the take guideline under at 
least one rule set: 5A through 5D, 6A, and 6B. In all 
these scenarios, decisions to retain habitat in the land- 
scape are based on either the amount of habitat pres- 
ent in hexagonal cells or the occupancy of cells ob- 
served in model runs that included all existing habitat. 
Thus, these scenarios all selectively retain areas where 
habitat is concentrated and eliminate smaller, scat- 
tered patches. 
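
A selection rule of the kind used in Scenario 6 can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in the simulations: each hexagonal cell is represented as a dictionary with hypothetical field names, percent habitat is assumed to be computed from federal plus nonfederal habitat, and nonfederal habitat is removed from any cell that falls below the chosen percentage.

```python
def apply_habitat_threshold(cells, min_percent):
    """Remove nonfederal habitat from cells whose total habitat is below
    min_percent of the cell area; federal habitat is never removed."""
    result = []
    for cell in cells:
        total = cell["federal_ha"] + cell["nonfederal_ha"]
        percent = 100.0 * total / cell["cell_ha"]
        kept = cell["nonfederal_ha"] if percent >= min_percent else 0.0
        result.append({**cell, "nonfederal_ha": kept})
    return result


# Hypothetical cells (areas in hectares)
cells = [
    {"id": 1, "cell_ha": 1000.0, "federal_ha": 0.0, "nonfederal_ha": 150.0},    # 15 percent habitat
    {"id": 2, "cell_ha": 1000.0, "federal_ha": 100.0, "nonfederal_ha": 300.0},  # 40 percent habitat
]
retained = apply_habitat_threshold(cells, min_percent=20.0)
retained_ha = sum(c["nonfederal_ha"] for c in retained)
```

Raising `min_percent` tightens the criterion, so that less nonfederal habitat is retained and what remains is concentrated in the cells with the most habitat.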


What Are the Characteristics of Habitat 
That Provide for an Efficient Contribution 
to Owl Conservation? 


The model results indicate that cells containing larger 
amounts of habitat make a more-efficient contribution 
to owl persistence than do cells containing lesser 
amounts. In addition, the location of those cells rela- 
tive to other large concentrations of habitat appar- 
ently influences efficiency of the habitat in providing 
for owls. The effect of location is observed most 
strongly in Scenario 5 and its subparts. Because both 
the amount of habitat in cells and the location of that 
habitat contributed to occupancy of habitat cells in 
initial runs, the selection of cells for retention was in- 
fluenced by both amount of habitat and the location 
of cells relative to the large concentrations of federal 
habitat. 

The conclusion that the amount of habitat in cells and
the location of cells affect the relative efficiency of contribution
seems relatively robust. However, it is much more
difficult to identify critical values of those characteris- 
tics. Under all rule sets except B, a gain in efficiency 
resulted from selecting cells with more than 20 
percent suitable habitat rather than selecting cells 
based on the current take guidelines (i.e., based on
proximity to owl activity centers). However, use of 20
percent suitable habitat as a critical value should be
viewed with skepticism, since it is dependent on the
structure of the demographic rule sets developed for
the model (Holthausen et al. 1995). Likewise, the scenarios
identified as being efficient (5A, 5B, and 5C
under rule set B; 5A, 5B, 5C, and 5D under rule set D;
6A; and 6B) consistently include nonfederal habitat
that falls within approximately 10 kilometers of the
federal land. However, the specific value of 10 kilometers
must be viewed with caution, since that value is
dependent on the structure of the model.

Examination of the distribution of retained habitat 
under these efficient scenarios (e.g., Fig. 62.2) suggests 
that lands on the western side of the Peninsula provide 
the greatest opportunity for efficient contribution to 
owl conservation. This is likely due to the location of 
those lands relative to federal lands and is consistent 
with findings of Holthausen et al. (1995). However, 
the specific boundary of the most efficient contribu- 
tion is again difficult to identify. Some lands in all por- 
tions of the Peninsula are identified in some of these 
scenarios, but the nonfederal contribution becomes 
more concentrated in the western and southwestern 
portion of the Peninsula as the selection criteria are 
tightened. 


To What Degree Do Different Levels of 
Nonfederal Contribution Affect Owl Persistence 
on the Olympic Peninsula? 


In the initial analysis of owls on the Olympic Penin- 
sula, Holthausen et al. (1995) concluded that reten- 
tion of nonfederal habitat could provide a biologically 
significant contribution to the maintenance of a stable 
population of owls on the Peninsula. However, no 
specific thresholds could be identified in this benefit. 
Population performance simply increased steadily 
with nonfederal habitat contribution (Fig. 62.5). 
Analyses completed for this report suggest that the ef- 
fects of several of the efficiently designed scenarios 
(e.g., 5A) may be similar to the effects of retaining all 
nonfederal habitat (Fig. 62.5). This suggests that re- 
tention of only one-half to two-thirds of the existing 
nonfederal habitat may be highly effective in providing
for owl persistence. However, this finding should
again be viewed with caution because it varies across
demographic rule sets and is dependent on the struc- 
ture and assumptions in the model. The finding also 
does not reflect potential value of habitat in providing 
a connection between the population in the center of 
the Peninsula and the population on the western coast 
of the Peninsula, especially the coastal strip of the 
Olympic National Park. 


What Is the Appropriate Use of 
the Analysis in Policy Decisions? 


The link between the analysis reported here and policy 
formulation is not straightforward. This analysis does 
not use the existing take guideline as an operating as- 
sumption, and thus it cannot be applied directly to de- 
cisions being made as part of the 4(d) rule process. 
Rather, the analysis assumes that some nonfederal 
habitat may be retained in areas that are not in prox- 
imity to owl activity centers, that the take circles 
around other activity centers might be managed to re- 
tain more than 40 percent suitable habitat, and that 
still other take circles might be managed to retain less 
than 40 percent suitable habitat. 

The analysis could be used to identify areas where 
the contribution of nonfederal habitat appears to be 
most significant. We found that areas on the western 
and southwestern sides of the Peninsula within ap- 
proximately 10 kilometers of federal land are com- 
mon to all of the scenarios that were identified as 
being efficient. As the amount of nonfederal land to be 
contributed increases, other geographic areas are also 
identified in some of the scenarios (Scenario 5A, rule 
set B; Scenario 5A, rule set D; Scenario 6A, Fig. 62.5). 
Thus, nonfederal contributions might emphasize the 
western and southwestern portions of the Peninsula, 
but other lands cannot be ruled out based on this 
analysis. All increments of habitat within these scenar- 
ios appear to provide positive benefits. 

Within the areas identified for nonfederal habitat 
contribution, the objective should be to retain habi- 
tat that is most concentrated and occurs in the largest 
blocks. The model suggests that home range areas 
containing less than 20 percent suitable habitat make 
a much smaller contribution to owl persistence than 
do areas with more habitat. Although this result is de- 
pendent on the structure and assumptions in the 
model, it could be used as a management hypothesis
and further tested through time. Application of this 
guideline to nonfederal habitat management would re- 
quire some mechanism other than the current take 
guideline. Habitat Conservation Planning might pro- 
vide such a mechanism, allowing for retention of habi- 
tat in configurations other than the simple mainte- 
nance of 40 percent suitable habitat within home 
ranges. Unless a specific mechanism is identified to 
provide the types of habitat configurations described 
here, the results of this analysis will have little value to 
the establishment of a 4(d) rule. If this analysis were 
used to locate areas for nonfederal contributions, but 
those areas were subsequently managed with the exist- 
ing take guidelines, the benefit to owls predicted here 
would not be realized. 

Specific results of these analyses should not be extrapolated
to other geographic regions within the
range of the northern spotted owl. Conditions on the
Olympic Peninsula are unique, with a large block of 
federal habitat surrounded by nonfederal habitat. In 
other parts of the range, the relative contributions 
of federal and nonfederal habitat to owl persistence 
may differ from the patterns observed in this 
analysis. 
Estimating the Effective Area of Habitat
Patches in Heterogeneous Landscapes 


Thomas D. Sisk, Barry R. Noon, and Haydee M. Hampton 


Animal abundance and population viability have
often been inferred via direct extrapolation from 
estimates of the area of suitable habitat. However, 
such estimates fail to account for changes in the spa- 
tial distribution of habitats such as those that may 
arise from fragmentation (e.g., Saunders et al. 1991; 
Wilcove et al. 1986; Faaborg et al. 1995). Conse- 
quently, it is now widely recognized that indirect esti- 
mates of animal abundance based on simple area met- 
rics, which do not include information about the 
spatial distribution of habitat, can be misleading. 
Added to the effects of spatial pattern of habitat is the 
recognition that source:sink dynamics (Pulliam 1988) 
and other subtle differences in habitat quality may re- 
sult in strong gradients in population density. 
Collectively, these insights indicate that inferences 
to animal abundance based on habitat assessments 
should be done at a landscape scale and that they 
should explicitly include information about spatial 
pattern and gradients in habitat quality (Beutel et al. 
1999). Recently added to the list of habitat factors 
known to affect animal abundance is variation in the 
character of the matrix that surrounds patches of 
habitat (Wiens 1996a,b, 1999; Gascon et al. 1999). 
Animal abundance, as well as survival and reproduc- 
tive rates, may be governed most strongly by land- 
scape-scale factors, including edge and matrix effects. 
This observation suggests that habitat quality is as
much a product of landscape-scale factors as it is of
proximate factors within a given habitat patch (For- 
man and Godron 1981; Temple 1986a; Murcia 1995; 
Gascon et al. 1999; Mesquita et al. 1999). As terres- 
trial landscapes become increasingly fragmented by 
human activities, a predictive approach that accounts 
for spatially extensive ecological factors in a spatially 
explicit context is needed to guide habitat manage- 
ment and conservation. 

The detrimental effects of changes in habitat area, 
spatial patterning, and landscape context are now 
widely accepted. However, the development of practical
tools to predict the effects of these factors and design appropriate
mitigation efforts has progressed relatively
slowly. Temple (1986a) and Temple and Cary (1988)
developed a “core area” model to explore possible effects
of forest fragmentation on interior-habitat bird
species, and Laurance and Yensen (1991) extended the 
capabilities of the model and applied it to other frag- 
mentation scenarios (e.g., Laurance 1991). In this 
chapter, we present a spatially explicit landscape 
model that builds upon these and other efforts to ac- 
count for edge effects and patch context (i.e., matrix 
effects), given changes in landscape pattern and con- 
text. Our approach extends the Effective Area Model 
(EAM; Sisk et al. 1997), a spatial model that adjusts 
animal density estimates based on a species’ response 
to habitat edges. The model “works” by linking 
field data and remotely sensed imagery through a 


TES 
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geographic information system (GIS) interface with 
maps of actual landscapes. The eventual goal of our 
research is to develop a tool that will help managers 
select among alternative management actions, each
generating a characteristic landscape configuration,
based on their effects on animal populations.


Overview 


Many authors have emphasized the need for land- 
scape-scale assessments of the relationship between 
variation in animal abundance and variation in habi- 
tat pattern and quality (Forman and Godron 1986; 
Turner and Gardner 1991; Wiens 1995; Morrison et 
al. 1998). This need has been recognized for decades. 
For example, Sisk and Battin (in press) point out that 
Leopold’s discussion of edge effects (Leopold 1933) 
constitutes an early conceptualization of matrix effects 
and a call for greater attention to landscape-scale fac- 
tors influencing habitat selection. Since that early be- 
ginning, attempts to deal with edge, matrix, and other 
landscape-scale factors have received much attention, 
but primarily in a descriptive context. Sisk and Battin 
(in press) note that “edge effects” has become a poorly 
defined concept that is routinely employed to “explain 
away” unidentified or poorly quantified processes that 
influence measured attributes within the focal habitat 
patch. As habitats become increasingly fragmented, 
large, homogeneous patches that were formerly domi- 
nated by internal processes may become highly influ- 
enced by adjacent habitats and thus be strongly af- 
fected by processes external to the boundaries of the 
patch (Harris 1988; Wiens 1989a; Laurance 1991;
Sisk and Battin in press).

We believe that understanding patterns in animal 
abundance and population persistence in rapidly 
changing landscapes will require an explicit, process- 
oriented approach to address edge effects and the jux- 
taposition of patches within increasingly heteroge- 
neous landscapes. Such an approach requires methods 
for estimating the differing responses of animals to 
spatial patterning of their environment—that is, the 
type of habitat patch in which the animal is located, as 
well as the context of that patch within the larger 
landscape. Collectively, information about location
and context, coupled with empirical measures of distribution,
abundance, and demography, will allow us
to predict various population attributes, such as expected
abundance or reproductive success, in changing
landscapes.


Effective Area Models: 
Scaling Issues and Spatial Variability 
in Animal-Habitat Relationships 


Wildlife biologists often struggle with scaling issues 
when attempting to model the relationships between 
animals and their habitats. The scale at which we collect
information on species distribution, abundance,
and fecundity, for example, is often much finer (typically
on the order of tens of hectares) than the scale at
which we wish to draw inference (typically on the order 
of hundreds or thousands of hectares) (Wiens et al. 
1985; Withers and Meentemeyer 1999). Associated 
with this difference in scale is a difference in the degree 
to which spatial variability is captured and quantified 
(Wilkie and Finn 1996). Field studies typically employ 
study designs that use replication to capture between-site
variability in response variables. However, logistical
constraints limit the number of habitat types that can
be studied with adequate replication. At the landscape
scale, the number of habitat types and the variation
among patches of similar habitat are often much greater
than what can be measured in the field. Similarly, empirical
studies usually must focus on only one or a few
focal species, yet variation in responses among species is 
often large, making it difficult to generalize animal re- 
sponses to habitat fragmentation and other changes in 
landscape structure (Noss 1991; Sisk et al. 1997). 

The issues that arise with mismatched scales have 
led to two contrasting attitudes that are prevalent in 
animal ecology studies. The first, widely held among 
field biologists, emphasizes the variability in nature. 
Extensive observation often suggests that every study 
site and each species have unique characteristics and, 
therefore, the animal:habitat relationship cannot be 
generalized. For our purposes, this translates to the as- 
sumption that every species scales its environment 
uniquely, and developing a useful modeling approach 
for predicting the effects of landscape change would 
require a unique, data-based model for each species:habitat
combination. For many ecologists, this is
tantamount to stating that modeling approaches are
of limited utility. A second perspective emphasizes 
general patterns in animal:habitat responses and fo- 
cuses more on comparing likely outcomes rather than 
attempting to predict with great precision the re- 
sponses of individual species. For example, a suite of 
“habitat interior species” may be identified from field 
data or life-history characteristics, and a general
model of their sensitivity to predators and parasites 
can be used to estimate the effects of habitat fragmen- 
tation (Temple and Cary 1988; Laurance 1991; 
Thompson 1993). This approach embraces the goal of 
most ecological models: to provide a useful simplifica- 
tion of complex phenomena in order to increase un- 
derstanding and predictive power. Of course, theorists 
and empirical ecologists often differ on what is a “useful
simplification.”

Models that attempt to fully incorporate the fine- 
scale heterogeneity that characterizes most terrestrial 
landscapes, and the great variability and complexity of 
population responses occurring at local scales, often become
so complex as to be of limited use to wildlife and
resource managers (e.g., Malcolm 1994). In attempting 
to address directly the complexities of heterogeneous, 
dynamic landscapes, and the suite of animal responses 
to landscape-scale influences, we suggest two guiding 
principles of model development. First, the approach 
must be empirically based. Purely theoretical models 
often sidestep the critical problems of data collection 
and parameter estimation—particularly challenging is- 
sues when managing for multiple species. Elegant theo- 
retical models often provide useful insights but have 
limited application when real-world management is- 
sues arise. Second, useful management-oriented models 
should allow the user to compare the expected out- 
come of various management alternatives (e.g., Noon 
and McKelvey 1996b). Although optimization ap- 
proaches may prove useful in some applications, man- 
agers seldom have the luxury of selecting the optimal 
management alternative, even if it were possible to 
identify one a priori. Instead, managers often are faced 
with a constrained set of options, each of which may 
have different impacts on animal populations, and the
task is to select the best of the alternative actions. 
Therefore, management-oriented landscape models 
should allow the user to evaluate the alternative landscape
patterns projected to arise from the different
management options and to select the alternative that 
“best” addresses the management objectives. From a 
decision-support perspective, a simple ecological model 
that will allow managers to correctly rank the relative 
values of competing alternatives is often more useful 
than a more elegant, complex model. There is often a 
pronounced trade-off between model utility and model 
complexity. If a simple model is adequate to discrimi- 
nate among and rank the outcomes from possible alter- 
natives, it obviously will be easier and cheaper to im- 
plement and thus be more likely to be used to help 
solve actual problems. 

Below, we describe a flexible modeling approach 
that is empirically based and designed to capture 
many of the influences that emerge from heteroge- 
neous landscapes and affect the distribution and abun- 
dance of animal populations. The model, as presented 
here, is parameterized by measuring the density re- 
sponse of target organisms as a function of their loca- 
tion in the landscape. The value of a population at- 
tribute, such as abundance, is affected by the quality 
of the patch in which the individuals reside as well as 
by the context of the patch within the landscape. Our 
approach is map-based and assumes that abundance 
can be characterized as a graded response dependent 
upon location relative to various habitat patches and 
patch boundaries. The term edges is used generically 
in our discussion to refer to abrupt patch boundaries 
that arise from many management activities. However, 
the model structure described below is flexible enough 
to include abrupt or gradual habitat transitions. 
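
One simple way to express such a graded response is a function that returns predicted density at a given distance into a patch from its boundary. The sketch below is illustrative only and assumes a linear change in density over a fixed transition distance; the functional form, the parameter names, and the example densities are assumptions made for this illustration and are not the fitted response functions used in the model.

```python
def edge_response(distance_m, density_at_edge, density_interior, transition_m):
    """Predicted density at a given distance (m) into a patch from its boundary.

    Density changes linearly from density_at_edge at the boundary to
    density_interior at transition_m and is constant beyond that distance.
    A transition_m of zero gives a step response instead of a graded one.
    """
    if transition_m <= 0:
        return density_interior if distance_m > 0 else density_at_edge
    fraction = min(max(distance_m / transition_m, 0.0), 1.0)
    return density_at_edge + (density_interior - density_at_edge) * fraction


# Hypothetical species that avoids edges: 2 birds/ha at the edge, 4 birds/ha in the interior
distances = (0, 50, 100, 200)
graded = [edge_response(d, 2.0, 4.0, 100.0) for d in distances]
step = [edge_response(d, 2.0, 4.0, 0.0) for d in distances]
```

A species that is more abundant near edges is represented in the same way by setting `density_at_edge` above `density_interior`.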


Conceptual Approach 


The Effective Area Model (EAM) is a straightforward 
approach that combines field-based measures of 
species’ responses to habitat edges with landscape- 
scale habitat maps derived from remotely sensed data. 
The integration of these two sources of information 
allows us to predict variations in animal abundance 
across heterogeneous landscapes and to explicitly ac- 
count for the spatial context of habitat patches. The 
method incorporates among-species variability in re- 
sponse to landscape boundaries by relating multiple 
species’ responses to a common, classified habitat 
map. Depending on the spatial resolution of the map,
using a single map may result in some unavoidable 
“smoothing” of habitat relationships. That is, there is 
a danger of excluding some habitat attributes that are 
associated with a given species’ spatial distribution. 
However, we believe the EAM approach achieves the 
proper balance between model complexity and man- 
agement utility. 

By selecting a response variable (e.g., density, re- 
productive success) expected to vary with landscape 
heterogeneity, one can conduct the empirical research 
necessary for parameterizing the model. The associa- 
tion of the edge response function with a detailed 
habitat map combines a simplified biological response, 
adequately described in the response function, with 
the detailed spatial information available from re- 
motely sensed data and GIS technology. 


Taxonomic Foci 


Our objective is to develop a map-based modeling ap- 
proach that is practical for use as a management tool 
and useful in addressing the habitat needs of diverse 
species. Our current model parameterization efforts 
focus on birds and butterflies because these taxa repre- 
sent diverse life histories and ecologies, providing an 
appropriate set of species for model development and 
testing. Both taxa are highly mobile and, thus, are able 
to respond rapidly to changes in habitat quality. In ad- 
dition, we assume that their vagility allows them to se- 
lect habitats based on behavioral cues rather than on 
limitations imposed by dispersal ability. Both groups 
are rich in species and easy to identify and survey in 
the field, and there is abundant natural history infor- 
mation available to permit analysis of the links be- 
tween habitat selection and life history traits. To illus- 
trate our methods, we restrict our discussion in this 
chapter to birds of desert riparian ecosystems. Other 
systems (Sisk et al. 1997; Haddad and Baum 1999) 
and taxa (Sisk and Haddad in press) are considered 
elsewhere. 


Landscape Characterization 


A chief limitation of most spatial models in ecology is 
their failure to capture key components of variability
in landscape pattern: patch attributes such as size,
shape, number, distribution, and spatial context. A 
typical modeling approach starts with a raster-based 
map of vegetation community types—that is, a classi- 
fication based on some combination of structural and 
compositional attributes (see, e.g., Avery and Berlin 
1992; Scott et al. 1996). Based on prior knowledge of 
the focal taxon, the map is categorized into habitat 
types by imposing a grain size and discriminating vegetation
communities based on assumed differences in quality
for the focal taxon. By use of some neighborhood rule,
individual habitat cells are then aggregated to define 
patches of habitat. From that point forward, each 
habitat patch is assumed to be homogeneous, and in 
most cases the ecological dynamics that occur within 
the patch are assumed to occur in virtual isolation 
from the rest of the landscape, perhaps with a disper- 
sal rule connecting some patches genetically and/or 
demographically. In other words, the landscape-scale 
analysis is reduced to a series of exercises whose goal 
is to classify the landscape into patches in a parsimo- 
nious manner. Unless the spatial location of individual 
patches is explicitly considered, such as in some 
metapopulation models (e.g., Hanski 1994a), the 
models no longer contain explicit spatial information. 
Even when movement is simulated across these mosaic 
landscapes, the dynamics are restricted to simple func- 
tions that describe the likelihood of successful move- 
ment based on between-patch distances. 
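
The aggregation step in this typical approach, in which habitat cells are grouped into patches by a neighborhood rule, can be sketched as a connected-component search over the raster. The example below is a generic illustration rather than the procedure of any particular study; the grid, the habitat code `"F"`, and the function name are hypothetical. It shows that the choice between a four-neighbor and an eight-neighbor rule alone can change the number of patches.

```python
from collections import deque


def label_patches(grid, habitat, diagonal=False):
    """Group cells of one habitat type into patches.

    diagonal=False joins cells sharing a side (four-neighbor rule);
    diagonal=True also joins cells touching at corners (eight-neighbor rule).
    Returns a list of patches, each a list of (row, col) cells.
    """
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = set()
    patches = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != habitat or (r, c) in seen:
                continue
            # Breadth-first search outward from an unvisited habitat cell
            queue = deque([(r, c)])
            seen.add((r, c))
            patch = []
            while queue:
                cr, cc = queue.popleft()
                patch.append((cr, cc))
                for dr, dc in steps:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] == habitat
                            and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            patches.append(patch)
    return patches


# Hypothetical classified map: "F" is the focal habitat, "." is everything else
grid = ["FF.F",
        "F..F",
        "..F."]
four = label_patches(grid, "F")
eight = label_patches(grid, "F", diagonal=True)
```

In this small map the four-neighbor rule yields three patches and the eight-neighbor rule yields two, because the isolated cell in the bottom row touches another habitat cell only at a corner.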

The expected quality of a patch, however, is not 
just a function of its size and type but also of its land- 
scape context. The influence of the surrounding habi- 
tats on the quality of a focal habitat patch, often 
called the matrix effect, is an important but often- 
overlooked influence on animal demography (Forman 
and Godron 1986). Brittingham and Temple (1983) 
noted that proximity to agricultural habitats could 
lead to novel interspecific interactions in forest 
patches and the decline of songbirds, and others have 
pursued this issue in greater detail (e.g., Temple and 
Cary 1988; Thompson 1993; Faaborg et al. 1995; 
Robinson et al. 1995; Williams-Linera et al. 1998). 
Our approach to modeling the effects of landscape 
heterogeneity on animal abundance incorporates the 
effects of patch context into assessment of habitat 
quality. That is, we compute the effective area of a 
habitat patch by adjusting estimates of animal abundance
upward or downward to account for patch context
and edge effects. Our efforts attempt to integrate
consideration of within-patch habitat quality with the 
influence of surrounding habitats. 
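
The idea of an effective area can be illustrated with a small raster calculation. The sketch below is a simplified reading of the concept, not the EAM implementation: each habitat cell is assigned a density from a user-supplied response function of its distance to the nearest cell of another type, the densities are summed to give predicted abundance, and the result is divided by interior density to express it as an equivalent area of interior-quality habitat. The grid, the cell size, the center-to-center distance measure, and the response function are all assumptions made for this example.

```python
import math


def effective_area(grid, habitat, cell_size_m, response, interior_density):
    """Return (predicted_abundance, effective_area_ha) for one habitat type.

    response(distance_m) gives density (individuals/ha) at a given distance
    from the nearest cell of any other type. Distance is measured between
    cell centers, a crude approximation to distance from the patch boundary.
    """
    cell_ha = (cell_size_m ** 2) / 10_000.0
    inside = [(r, c) for r, row in enumerate(grid)
              for c, value in enumerate(row) if value == habitat]
    outside = [(r, c) for r, row in enumerate(grid)
               for c, value in enumerate(row) if value != habitat]
    abundance = 0.0
    for r, c in inside:
        if outside:
            cells_away = min(math.hypot(r - r2, c - c2) for r2, c2 in outside)
            distance_m = cells_away * cell_size_m
        else:
            distance_m = math.inf  # no other habitat type anywhere on the map
        abundance += response(distance_m) * cell_ha
    return abundance, abundance / interior_density


# Hypothetical 9-ha forest patch ("F") in a matrix ("M"), 100-m cells
grid = ["MMMMM",
        "MFFFM",
        "MFFFM",
        "MFFFM",
        "MMMMM"]


def response(distance_m):
    """Assumed response: 0.5/ha at the edge rising linearly to 1.0/ha at 200 m."""
    return 0.5 + 0.5 * min(distance_m / 200.0, 1.0)


abundance, area_ha = effective_area(grid, "F", 100.0, response, interior_density=1.0)
```

Under these assumptions the 9-hectare patch has an effective area of 7 hectares for this edge-avoiding species; a species that favors edges would produce an effective area larger than the mapped area.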


Methods 


Collection of appropriate field data for model parame- 
terization depends on locating sites for bird surveys 
that are suitable for drawing inference over the mod- 
eled landscape. Where possible, we advocate selection 
of study sites that span habitat edges between all well- 
represented habitat types. In practice, this may involve 
an impractically large number of habitats and edge 
combinations, and the selection of study sites will 
have to be prioritized. In such cases, we prioritize edge 
types based on their commonness of occurrence in the 
modeled landscape and their sensitivity to proposed 
management activities. Common edges receive higher
priority than rare edge types, and those that are most
likely to be altered by management actions receive
greater attention than those less sensitive to human activities.
We have found that the number of edge types
that are abundant in real landscapes is typically much 
smaller than the number of possible combinations of 
all habitat types. Based on field experiences in Califor- 
nia, Arizona, and Costa Rica, the number of edges 
that must be examined empirically rarely exceeds five, 
and in many cases it is lower. Although parameterizing
edge responses of multiple species at up
to five different edge types requires a large field effort,
this typically is much less demanding than the require- 
ments for field data imposed by demographic models. 
In fact, we believe that the effort required to parame- 
terize the EAM is modest compared to most other em- 
pirically driven landscape models and that the neces- 
sary resources are likely to be available for many 
pressing habitat management issues. For example, in 
the application presented below from the San Pedro 
River in southeastern Arizona, we are focusing on 
three predominant edge types in a complex mosaic of
desert riparian habitats. Funding limitations prevent 
the collection of empirical data across all possible edge 
combinations; however, these data gaps affect model 
predictions over only a small area of the landscape
(see Figs. 63.7, 63.9 in color section). This field effort,
while considerable, has been carried out by a four-per- 
son crew over two field seasons, a modest commit- 
ment of resources in comparison to several other on- 
going bird population studies in this watershed. 


Density Estimates 


We first established multiple survey points along tran- 
sects running orthogonal to transitions between differ- 
ent habitat patches. For each transect, we randomly 
selected a survey point along the habitat edge, then ex- 
tended the transect 200 meters into both of the ad- 
joining habitats. Transects spanned the full range of 
habitat types, with multiple sample points located 
within each habitat type at different distances from 
the patch boundary. At each survey point, we recorded
the location (distance and direction from survey point) 
and identity of each bird detected. Because we expect 
the likelihood of detection to decrease with distance 
from the point, we used variable distance point sam- 
pling, recording the distance from each sample point 
to the detection. To estimate the probability of detection
($\hat{P}$) out to some effective distance (w) from the
survey point, these data are analyzed separately by
species with program DISTANCE (Buckland et al.
1993). Detection probabilities are estimated for each
species in each major habitat type, and these estimates
are used to adjust the number of individuals counted
(n) at each survey point. Density at each survey point
is estimated by

$$\hat{D} = \frac{n}{\pi w^{2} \hat{P}}$$

Finally, each survey point, with its corresponding density
estimate for species i ($\hat{D}_i$), is given precise spatial
coordinates using global positioning system (GPS) 
technology. 
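
The adjustment of a raw count by detection probability can be illustrated with a short sketch. It assumes the usual point-transect form, in which the count is divided by the surveyed area (a circle of radius w) times the detection probability. The function name, the conversion to hectares, and the example numbers are hypothetical and are not taken from the field data or from program DISTANCE.

```python
import math

def point_density(n_detected, w_m, p_detect):
    """Estimate density (individuals per hectare) at one survey point.

    n_detected: number of individuals counted within w_m of the point
    w_m: effective survey radius in meters
    p_detect: estimated probability of detection within w_m (0 < p <= 1)
    """
    if not 0.0 < p_detect <= 1.0:
        raise ValueError("p_detect must be in (0, 1]")
    area_ha = math.pi * w_m ** 2 / 10_000.0  # square meters -> hectares
    return n_detected / (area_ha * p_detect)

# Hypothetical example: 3 birds detected within 50 m, detection probability 0.6
print(round(point_density(3, 50.0, 0.6), 2))  # 6.37
```

A lower detection probability inflates the estimate for the same count, which is why the probabilities are estimated separately for each species and habitat type.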


Density Response Functions 


Each survey point that provides a density estimate for 
species i is assigned a habitat type label, and its spatial 
coordinates allow an estimate of the distance from the 
point to the patch boundary (edge). These data allow 
us to fit a regression of density (dependent variable) on 
distance from the edge (independent variable) 
for each edge type (see Fig. 63.1 in color section). The
methods used to estimate response functions and the
terminology used to describe the functions are similar
to those of Malcolm (1994). Various regression models can be
assumed for species showing either declines or increases
in abundance as edges are approached, and
both linear and nonlinear regression models can be
used. In practice, however, we have found that most response
curves can be described adequately by a limited
set of functions, including simple linear, exponential,
half-normal, power, and second-degree polynomials.

Figure 63.2. Density response functions for yellow warbler.
Species-specific responses to landscape heterogeneity
can be expressed as a family of edge response curves
(see text). Here, we fit linear response functions to empirical
data quantifying the yellow warbler (Dendroica petechia) response
to four unique edge types in the riparian landscape.
This family of curves describes the influence of the adjacent
habitats on the quality of the three basic habitat types.

We consider the edge to be the origin in our regres- 
sion models—that is, distance 0 represents the edge. 
As a consequence, two separate response functions are 
estimated for each species for each edge type—the 
density response as location is moved from the edge 
into each of the adjoining habitat types (habitat type 
A and habitat type B). The regression models provide 
estimates of the density at the edge (y-intercept), the 
distance over which density is dependent upon distance
from the edge (dmax), and the constant density that
occurs beyond dmax (k, the basal density in a given
habitat type). It is important to note that the two
functions need not be mirror images of each other and 
that dmax and basal density may differ between adja- 
cent habitat types. An example of density response 
functions for a warbler breeding along the San Pedro 
River in southeastern Arizona is shown in Figure 63.2
(A. Brand unpublished data).
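
A linear response function of this kind can be written as a small piecewise function of three parameters: the density at the edge, dmax, and the basal density. The sketch below is only an illustration of that structure. The function name is hypothetical, and the example values (0.70 individuals per hectare at the edge, 0.38 in the interior, dmax of 100 meters) are the yellow warbler values reported later in this chapter for cottonwood/willow adjoining mesquite.

```python
def linear_edge_response(distance_m, edge_density, basal_density, d_max):
    """Density at a given distance from the edge (distance 0 = the edge).

    Density changes linearly from edge_density at the edge to
    basal_density at d_max, and is constant beyond d_max.
    """
    if distance_m < 0:
        raise ValueError("distance_m must be non-negative")
    if distance_m >= d_max:
        return basal_density
    frac = distance_m / d_max  # 0 at the edge, 1 at d_max
    return edge_density + (basal_density - edge_density) * frac

# Yellow warbler in cottonwood/willow adjoining mesquite
for d in (0, 50, 100, 200):
    print(d, round(linear_edge_response(d, 0.70, 0.38, 100.0), 2))
```

A second call with its own three parameters would describe the response on the other side of the same edge, since the two functions need not mirror each other.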


Landscape Maps and 
Habitat Classification 


Defining and mapping the habitat types relevant to the 
focal taxon is a difficult and time-consuming task. 
Since this process has been described extensively in the 
published literature (e.g., Avery and Berlin 1992; 
Wilkie and Finn 1996; Turner and Gardner 1991) and 
elsewhere in this volume, we do not go into detail 
here. With the increase in availability of spatial data, 
such as gap analysis project (GAP; Scott et al. 1996) 
coverages and numerous sources for classified re- 
motely sensed data, users may opt to use previously 
prepared vegetation or habitat maps in the model. 

To create our landscape-scale maps, we relied ex- 
tensively on remotely sensed data, including Thematic 
Mapper Simulator (TMS) data. Researchers familiar 
with the site classified the habitat maps we use in our 
example, drawing on georeferenced TMS imagery and 
referring to hardcopy aerial photographs. We also use 
parametric and nonparametric classification algo- 
rithms, such as parallelepiped and minimum distance 
to means, which are available in off-the-shelf image 
processing software, for generating habitat data for 
the model. The particular technique chosen by the 
user will depend on several factors, including the 
availability of spatial data, digital-image processing 
capabilities, and expertise. Accuracy assessment of the 
resulting maps is itself an involved process and may 
vary from map product to map product. The effect of 
mapping error in habitat-relationship models is an 
area of active research (see Garrison and Lupo, Chap- 
ter 30; Gonzalez-Rebeles et al., Chapter 57; Karl et 
al., Chapter 51). In cases where model error has been 
partitioned among mapping error and error associated 
with the empirical estimation of population parame- 
ters, the latter has proven to be of greater importance 
(A. King personal communication). The number of 
case studies is, however, quite limited, and further in- 
vestigation is needed before general conclusions can 
be drawn about the relative importance of mapping 
error and error in population parameters. 


Model Structure 


In this raster-based spatial model, a species-specific
density grid is created by evaluating the response functions
relative to the nearest edge for each pixel in the
habitat map. The number of individuals in any region
or habitat type is then calculated from the density
grid. The model was developed for ArcView GIS
(ESRI, Redlands, Calif.) using the Spatial Analyst extension
and Avenue scripting language. The current
developmental version has been tested under ArcView
3.2 (Windows 95/98/NT) but should also work under
ArcView 3.1. The step-down process we used to develop
the EAM is summarized in Figure 63.3.

Figure 63.3. Effective Area Model flowchart. Key steps in model development are illustrated. They are described in greater detail in the text.

Model input: Habitat spatial data (Figure 63.4). Import habitat spatial data into ArcView GIS.
HABITAT GRID: Convert habitat spatial data to ARC/INFO GRID format (user specifies cell size).
CLOSEST HABITAT GRID: Perform proximity analysis on habitat grid.
EDGE HABITAT GRID: Determine all pairs of adjacent habitats by combining habitat and closest habitat grids (Figure 63.5).
Model input: EDGE RESPONSE CURVES. User inputs a response function and maximum distance of edge influence for each pair of adjacent habitats.
DISTANCE GRID: Determine distance to closest habitat edge (Figure 63.6).
ANIMAL DENSITY GRID: Apply animal density functions across habitat edges (Figure 63.7).
Model output: Average animal density and total number of individuals in each habitat and over any user-specified area (Tables 63.1 and 63.2).

The EAM requires two classes of model input:
characterization of each species' density response to
habitat types and edges, and a detailed habitat map 
(see Fig. 63.4 in color section). The habitat map, por- 
trayed at a landscape scale, is developed from re- 
motely sensed and/or field data. Two decisions are 
made at this point: (1) the number and characteristics 
of habitat types to use in classifying the map, and (2) 
the size of the minimum mapping unit. In practice,
these decisions are often constrained by data availabil- 
ity and the management objectives. It is important, 
however, that the habitat classes and minimum mapping
unit be appropriate for the focal species with
the finest-grained response to edge and matrix effects.

TABLE 63.1.

Projected population size for three bird species, based on animal density grids (e.g., Figure 63.7).

                                       Null model                     Effective area model           % Difference
Species                                C/W(a)  Mes(b)  DS(c)  Total   C/W(a)  Mes(b)  DS(c)  Total   C/W(a)  Mes(b)  DS(c)  Total
Yellow warbler (Dendroica petechia)    14.8    8.4     0.0    23.2    22.9    41.7    3.8    68.4    35      80      100    66
Abert's towhee (Pipilo aberti)
3:96 16:8 Pfs) PASS) 2319 152 4 SIOMEESET2 84 68 70 S
Black-throated sparrow (Amphispiza bilineata)
0.0% 31.9 26:4 58.3 1559535 T1 56 EEEOSO 100 10 ES 38

Note: Numbers of individuals are broken down by habitat type as estimated by a null model ignoring edge effects and the Effective Area Model.
Percent differences illustrate the important influences that edge responses may have in determining species abundances at the landscape scale.

(a) C/W = cottonwood/willow gallery forest.
(b) Mes = mesquite woodland.
(c) DS = desert scrub.

In order to apply the density response curves, the 
model first determines which types of contiguous 
habitat exist and where they are located. This is ac- 
complished by combining the habitat map with a map 
of the closest habitat type. The resulting edge habitat 
map (see Fig. 63.5 in color section) divides the land- 
scape into classes of edge habitat. The model then 
prompts the user to input species-specific response 
functions and maximum distance of edge influence 
(dmax) for each type of edge habitat. For example, we 
found the yellow warbler (Dendroica petechia) to 
have a basal density of 0.38 individuals per hectare in 
the interior of the cottonwood/willow habitat, rising 
to 0.70 individuals per hectare along the boundary 
with mesquite (Fig. 63.2). This increasing trend begins 
100 meters from the habitat boundary. Based on pre- 
liminary data, we assumed this trend was linear, but 
nonlinear response curves can also be used. 
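
The grid operations described above (finding the closest differing habitat, labeling each cell with its edge class, and measuring distance to the edge) can be sketched on a toy raster. This is not the ArcView/Avenue implementation. It is a brute-force illustration in plain Python, with hypothetical names, a made-up 3 x 5 habitat grid, and an assumed 30-meter cell size. Distances are measured between cell centers, so they overstate the distance to the true boundary by about half a cell.

```python
import math

def edge_grids(habitat, cell_size=1.0):
    """Return (closest, distance) grids for a small habitat raster.

    For each cell, 'closest' is the habitat type of the nearest cell whose
    type differs from the cell's own, and 'distance' is the center-to-center
    distance to that cell. Brute force, so suitable for toy grids only.
    """
    rows, cols = len(habitat), len(habitat[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    closest = [[None] * cols for _ in range(rows)]
    distance = [[math.inf] * cols for _ in range(rows)]
    for r, c in cells:
        for r2, c2 in cells:
            if habitat[r2][c2] == habitat[r][c]:
                continue  # same habitat type, not an edge neighbor
            d = math.hypot(r - r2, c - c2) * cell_size
            if d < distance[r][c]:
                distance[r][c] = d
                closest[r][c] = habitat[r2][c2]
    return closest, distance

# Hypothetical map: cottonwood/willow (CW), mesquite (Mes), desert scrub (DS)
habitat = [
    ["CW", "CW", "CW", "Mes", "Mes"],
    ["CW", "CW", "CW", "Mes", "Mes"],
    ["CW", "CW", "CW", "Mes", "DS"],
]
closest, distance = edge_grids(habitat, cell_size=30.0)

# Edge habitat class = (own habitat, closest differing habitat)
edge_class = [list(zip(h_row, n_row)) for h_row, n_row in zip(habitat, closest)]
print(edge_class[0][0], distance[0][0])  # ('CW', 'Mes') 90.0
```

Each distinct pair in the edge class grid corresponds to one response function and one dmax value supplied by the user.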

Once the model locates and characterizes the edge 
habitat classes, a distance-to-edge grid (see Fig. 63.6 in 
color section) is created to supply distance values for 
the density response curves. The model then projects 
values from the appropriate density response curves to 
their respective edge habitats, forming a grid of density
values for a given species (see Fig. 63.7 in color
section). The EAM’s adjustment for edge effects can 
significantly change population estimates. For exam- 
ple, when edge effects were included, the predicted 
population size of yellow warblers in the study area 
increased by 66 percent over the null model (Table 
63.1). Because the yellow warbler is an edge exploiter 
at each edge type, its estimated population size is 
higher in each habitat when edge effects are consid- 
ered. In this example, the Abert’s towhee (Pipilo 
aberti) and black-throated sparrow (Amphispiza bilin- 
eata) exhibit different edge-exploiting behavior, influ- 
encing predicted population sizes (Table 63.1). 

As noted above, time, cost, and other constraints 
are likely to limit the amount of field data on edge re- 
sponses that will be available or can be collected 
across the model landscape in a timely manner. Vari- 
ous assumptions regarding animal density surround- 
ing these edges can be specified in the EAM. Here we 
assume that densities near edges for which we have no 
empirical edge response are equivalent to the values of 
the interior habitats. That is, the EAM reverts to a 
null model that ignores edge and matrix effects in lo- 
cations where edge responses are not quantified. In 
our example, it is assumed that edge response curves 
are not available for the habitat boundary between
cottonwood/willow and desert scrub habitats. Thus,
the null model and EAM return the same animal den- 
sity values adjacent to these edge types. 
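
The fallback to the null model can be made concrete with a minimal sketch. Each cell is described by its habitat, its closest differing habitat, and its distance to that edge. Where no response curve has been supplied for an edge type, the cell simply receives the basal density. All names are hypothetical. The cottonwood/willow values come from the yellow warbler example above, while the four cells and the one-hectare cell area are invented for illustration.

```python
def cell_density(habitat, neighbor, dist_m, basal, curves):
    """Density for one cell under a linear edge response.

    Falls back to the basal density (the null model) when no response
    curve is available for the (habitat, neighbor) edge type.
    """
    curve = curves.get((habitat, neighbor))
    if curve is None:
        return basal[habitat]
    edge_density, d_max = curve
    if dist_m >= d_max:
        return basal[habitat]
    return edge_density + (basal[habitat] - edge_density) * dist_m / d_max

def total_abundance(cells, basal, curves, cell_area_ha):
    """Sum density times cell area over all cells."""
    return sum(cell_density(h, n, d, basal, curves) * cell_area_ha
               for h, n, d in cells)

basal = {"CW": 0.38}                     # interior density, individuals/ha
curves = {("CW", "Mes"): (0.70, 100.0)}  # (density at edge, dmax in meters)

# Hypothetical cells: (habitat, closest habitat, distance to edge in meters)
cells = [("CW", "Mes", 0.0), ("CW", "Mes", 50.0),
         ("CW", "Mes", 150.0), ("CW", "DS", 10.0)]

eam_total = total_abundance(cells, basal, curves, cell_area_ha=1.0)
null_total = total_abundance(cells, basal, {}, cell_area_ha=1.0)
print(round(eam_total, 2), round(null_total, 2))  # 2.0 1.52
```

The cell next to desert scrub contributes the same amount to both totals, because no curve exists for that edge type.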

Although many sources of error influence EAM 
output, we believe that the most important potential 
source of prediction error is associated with the empir- 
ical estimation of edge response functions. We are cur- 
rently examining two approaches for incorporating 
error estimates into the EAM. One approach assumes 
that the dependent variable (e.g., animal density) is 
normally distributed given any value of the independent
variable (i.e., distance to the habitat edge) and calculates
expected error based on the mean squared
error. An alternative approach employs Monte Carlo
methods to sample from the empirically derived distri- 
bution of the response variable and constructs confi- 
dence intervals for estimates at various distances from 
the edge. Ongoing efforts to test and refine these 
methods will lead to incorporation of error estimation 
algorithms into the first version of the model to be 
widely distributed to managers. 
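
One way a Monte Carlo interval of this kind could be obtained is by bootstrap resampling of the survey points. The sketch below is an assumption about how such a procedure might look, not the algorithm under development for the EAM. It resamples (distance, density) pairs with replacement, refits a straight line each time, and takes percentiles of the predicted density at a chosen distance. The data, function names, and number of draws are all hypothetical.

```python
import random

def fit_line(points):
    """Ordinary least squares fit; returns (intercept, slope)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def bootstrap_interval(points, distance_m, n_draws=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for predicted density at distance_m."""
    rng = random.Random(seed)
    preds = []
    while len(preds) < n_draws:
        sample = [rng.choice(points) for _ in points]
        if len({x for x, _ in sample}) < 2:
            continue  # a line cannot be fit to a single distance value
        intercept, slope = fit_line(sample)
        preds.append(intercept + slope * distance_m)
    preds.sort()
    lo_idx = int((1 - level) / 2 * n_draws)
    hi_idx = int((1 + level) / 2 * n_draws) - 1
    return preds[lo_idx], preds[hi_idx]

# Hypothetical survey points: (distance from edge in m, density per ha)
points = [(0, 0.72), (25, 0.66), (50, 0.52), (75, 0.49), (100, 0.36),
          (0, 0.68), (25, 0.60), (50, 0.57), (75, 0.43), (100, 0.40)]

low, high = bootstrap_interval(points, distance_m=50.0)
intercept, slope = fit_line(points)
print(low < intercept + slope * 50.0 < high)  # True
```

Intervals computed at several distances could then be carried through the density grid to bracket the projected population size.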


Application: Assessing the Impacts of 
Habitat Change on Birds in a Desert 
Riparian Ecosystem 


The San Pedro River in southeastern Arizona is the 
last free-flowing river in the Southwest. It supports 
very high species diversity for many taxonomic 
groups, and it is an important seasonal habitat for 
Neotropical migrant birds (Skagen et al. 1998). The 
San Pedro is a threatened river due to increased de- 
mand for water to supply agriculture and rapid urban 
growth. Groundwater depletion may have dramatic 
effects on riparian vegetation leading to the decline in 
recruitment of riparian flora and increased mortality 
of mature vegetation (Stromberg et al. 1992, 1996; 
Auble et al. 1994). Excessive pumping of groundwater 
resources has led to the degradation and de-watering 
of arid streams and springs worldwide (Gremmen et 
al. 1990), threatening rare riparian habitats and com- 
promising the viability of species dependent upon 
them. The conflict over groundwater pumping and instream
flow of the San Pedro River, and associated impacts
on riparian habitats and the biological diversity
they support, is one of the most controversial in the
nation, involving county, state, and international law. 

Change in habitat type and landscape composition 
has been prevalent over the past several decades on 
most desert rivers, and the San Pedro is no exception. 
The first European explorers described the San Pedro 
watershed as an expansive wetland system, with meandering
small streams and only sparse woody vegetation
(Hendrickson and Minckley 1984). In the early
twentieth century, channel incision and the establish- 
ment of cottonwood-willow gallery forest along the 
river channel dramatically changed the vegetation of 
the riparian corridor (Hastings and Turner 1980). 
Mining activities placed increased demands on limited 
water resources in southeastern Arizona during the 
first half of the twentieth century, and rapid urban ex- 
pansion over the past twenty years has dramatically 
increased groundwater pumping rates. Water extrac- 
tion has lowered the water table in some locations, 
threatening cottonwood-willow, mesquite woodland, 
and grassland habitats that compose the corridor of ri- 
parian habitat that occupies the river floodplain. As 
demands for groundwater continue to increase, shifts 
in habitat type and in the spatial configuration of 
patches of riparian vegetation become more likely 
(Auble et al. 1994; Stromberg et al. 1996), and the 
fate of the riparian habitats that support the majority 
of the region's biodiversity hangs in the balance. The 
complex relationship between hydrology, vegetation, 
and biodiversity—poorly understood and laced with 
scientific uncertainty—has emerged as a controversial 
management issue. 

We used a prototype of the EAM to examine avian 
responses to habitat heterogeneity and landscape 
change along a 1.5 × 3-kilometer section of the upper
San Pedro River. We tested the ability of the model to 
capture edge and matrix effects by contrasting predic- 
tions with those of a null model that ignores edge ef- 
fects and uses a single mean density to predict animal 
abundance in each habitat patch. We also demonstrate 
use of the model to predict the consequences of habi- 
tat change (in this case habitat type conversion from 
more mesic to more xeric types) resulting from a hy- 
pothetical drawdown of the aquifer maintaining flows 
in this particular segment of the San Pedro River.


Drawdown Scenario


The riparian vegetation along the river corridor stands 
in stark visual contrast to the neighboring desert scrub 
habitat that predominates across much of the sur- 
rounding region, where the water table is too deep for 
the establishment of woody riparian species. 

Cottonwood/willow gallery forest dominates along 
the perennial stretches of river, while mesquite bosque 
and Sacaton grasslands occupy more xeric sites in the 
river floodplain. The edges between these habitat 
types tend to be sharp and obvious in this hydrologi- 
cally driven landscape structure (Fig. 63.4). In the hy- 
pothetical scenario that we simulated for the purpose 
of demonstrating the EAM, we assumed a simple type 
conversion from more mesic to more xeric habitat 
types following a significant drawdown of the water 
table. Approximately 10 hectares of cottonwood/wil- 
low habitat are converted to mesquite bosque, and 40 
hectares of mesquite are converted to desert scrub (see 
Fig. 63.8 in color section). Although the response of 
vegetation to hydrological change is certain to be 
more complex, similar landscape-scale changes have 
been observed in some desert riparian habitats follow- 
ing water diversion or prolonged drought. 

Using the methodology described above, we created 
a habitat map for the drawdown scenario and gener- 
ated new density grids for each of three bird species— 
yellow warbler, Abert's towhee, and black-throated 
sparrow. This allowed us to estimate the effects of 
simulated habitat change by comparing existing con- 
ditions with those following the drawdown. 


Assessing the Importance of 
Edge and Matrix Effects 


Comparisons of the EAM and the null model suggest 
that the influences of habitat edges on avian distribu- 
tion and density can be large in heterogeneous land- 
scapes such as desert riparian systems. Where species 
show strong affinity for particular habitat edges, such 
as was seen for the yellow warbler at the cottonwood/willow-mesquite
edge (Fig. 63.1), the resulting
differences in predicted avian densities in the adjoining
habitats may be quite large. For example, the
EAM predicts 22.9 yellow warblers in
cottonwood/willow habitat within the study
area, compared to 14.8 birds in the null
model (Table 63.1). The projected number of this species in
mesquite habitats is 41.7 birds,
compared to only 8.4 birds in the
null model. For other species, the contrast between
predictions of the EAM and null model is less pronounced
but still large enough to suggest that consideration
of edge responses in heterogeneous riparian
landscapes is important when assessing habitat quality 
and avian distributions (Table 63.1). 

Model results for this hypothetical scenario 
demonstrate possible applications of the Effective 
Area Model for predicting the outcomes of alterna- 
tive management scenarios. Although it is inappro- 
priate to attempt to extract detailed management les- 
sons from this hypothetical scenario that is based on 
preliminary field data, several salient points emerge 
from this exercise that reflect on the utility of the 
EAM approach, in general. 

First, model results demonstrate that the EAM is 
sensitive to habitat fragmentation, as the amount, 
type, and location of habitat edges change. Given re- 
alistic species-specific edge responses, the EAM pro- 
vides a rapid, automated means of assessing the si- 
multaneous shifts in species abundance, or other 
edge-sensitive response variables, given a relatively 
small set of clear assumptions regarding habitat use 
and landscape structure. Figure 63.9 (see color
section) illustrates how changes in subsurface hydrological
properties may impact animal populations by
driving complex changes in the amount of available
riparian habitat and the spatial distribution and juxtaposition
of habitat patches. Clearly, the relationship
between hydrological processes and vegetation
change lies at the heart of any predictions of avian responses.
Such modeling efforts are currently underway
along the San Pedro River (e.g., Stromberg et al.
1996), and the possibility of linking the EAM to hy- 
drological models is an intriguing concept that may 
soon be tractable. 

Finally, it is apparent from our analysis of the 
drawdown scenario that differences between EAM 
and null model predictions were less in the original 
landscape than for the drawdown scenario (Tables 
63.1 and 63.2). Since more complex landscape structures
generally contain more edge (Sisk and Margules
1993), this result is not unexpected. However, it suggests
a more general rule: that the consideration of
edge effects becomes more important as landscape
complexity increases.

TABLE 63.2.

Projected population sizes for three bird species before and after habitat
type conversion resulting from a hypothetical aquifer drawdown scenario
along a desert riparian river.

                                                 Null model   Effective area model   % Difference
Yellow warbler (Dendroica petechia)
  Initial landscape                              23.2         68.4                   195
  Following habitat type conversion              17.9         48.0                   168
  Predicted change                               -5.3         -20.4
Abert's towhee (Pipilo aberti)
  Initial landscape                              23.3         85.2                   266
  Following habitat type conversion              19.9         65.7                   230
  Predicted change                               -3.4         -19.5
Black-throated sparrow (Amphispiza bilineata)
  Initial landscape                              58.3         93.6                   61
  Following habitat type conversion              56.8         109.4                  93
  Predicted change                               -1.5         15.8

Note: Type conversion increases the relative abundance of edge-influenced habitat,
increasing the differences between estimates from the EAM and null models.
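
The percent differences reported in Table 63.2 appear to express the EAM estimate relative to the null-model estimate. That convention is inferred here from the tabulated values rather than stated by the table, and the function name is hypothetical. Under that assumption, the black-throated sparrow entries can be reproduced as follows.

```python
def percent_difference(null_estimate, eam_estimate):
    """EAM estimate relative to the null-model estimate, in percent."""
    return (eam_estimate - null_estimate) / null_estimate * 100.0

# Black-throated sparrow, Table 63.2
print(round(percent_difference(58.3, 93.6)))   # 61, initial landscape
print(round(percent_difference(56.8, 109.4)))  # 93, after type conversion
```

For this species the gap between the two models widens after conversion, consistent with the added edge in the altered landscape.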


Discussion 


A number of mechanistic explanations for edge effects 
on birds have been demonstrated, and additional 
mechanisms have been proposed. These include in- 
creases in parasitism and predation rates (e.g., Robin- 
son et al. 1995; Arango-Velez and Kattan 1997), 
changes in microclimate (Kuitunen and Makinen 
1993), and changes in the distribution of food re- 
sources resulting from changes in vegetation structure 
or cross-boundary subsidies (e.g., Restrepo and 
Gomez 1998). Because a diversity of mechanisms may 
be operative at patch edges, and because of great dif- 
ference in species’ life histories, responses to these fac- 
tors range from positive, to neutral, to negative. Fur- 
thermore, the response of a particular species may 
differ at different types of habitat edges (Sisk and
Haddad in press). Despite this complexity in edge responses,
the EAM approach has proven practical in
several studies of animal-habitat relationships (see Sisk
and Haddad in press), and the EAM performed signif- 
icantly better than a null model ignoring edge effects 
when predicting the abundances of breeding birds in 
patches of oak woodland habitat in coastal California 
(Sisk et al. 1997). 

The EAM does not explicitly incorporate the 
mechanisms that give rise to edge effects. Rather, the 
consequences of edge effects are assumed to be ex- 
pressed in some demographic variable, such as den- 
sity or reproductive success. In addition, the value of 
this variable is assumed to vary continuously as a 
function of distance from a habitat edge. Though 
some ecologists may view the EAM approach as defi- 
cient because it is more phenomenological than 
mechanistic, we believe the model structure can be 
defended on purely pragmatic grounds. The realities 
of conservation planning and multispecies management
preclude the luxury of developing detailed
mechanistic models for most species.


Management Implications


Which habitats are most important for sustaining bio- 
logical diversity? What management activities can be 
implemented to improve the quality of existing habi- 
tats for species of concern? These are common, funda- 
mental questions frequently asked of researchers by 
land and wildlife managers (see, e.g., Dasmann 1981; 
Forman and Godron 1986). Although inherently diffi- 
cult to answer, these questions are unfortunately be- 
coming more urgent and intractable as a consequence 
of rapid rates of habitat loss and changes in land 
cover. The result is a displacement of ecological sys- 
tems outside their historic ranges of variability, a con- 
sequence of introducing novel conditions of land use, 
water diversion and disruption of hydrological cycles, 
shifting climate, and other forms of global and re- 
gional change. The significance of a landscape ecolog- 
ical perspective is widely acknowledged by land man- 
agers. However, at the same time that managers are 
discovering the relevance of concepts such as land- 
scape pattern, heterogeneity, connectivity, and edge ef- 
fects, they are discovering that few practical tools have 
been developed for linking landscape-scale ecological 
theory with practical management issues. 

The EAM is an attempt to estimate the effects of 
habitat transitions and boundaries on the distribution 
and abundance of mobile organisms in heterogeneous 
landscapes. Predictions from our prototype model 
suggest that the responses of species to habitat edges 
may exert profound influence on such landscape-scale 
patterns. These effects may be particularly pro- 
nounced in the desert riparian ecosystems explored 
here because they are naturally linear, heterogeneous, 
disturbance-prone, and characterized by abrupt habi- 
tat edges. However, the differences between predic- 
tions of the EAM and a null model that excludes edge 
effects suggest that edges should not be ignored in 
other landscapes when assessing the effects of land- 
scape pattern and heterogeneity on distribution and 
abundance. 

Although animal abundance may not always be a 
reliable predictor of population status (Van Horne 
1983; Pulliam 1988), we believe that it is often the 
only tractable response variable when many species 
are of interest and more-detailed demographic approaches
are impractical. Where demographic data
permit detailed analysis of population viability, the 
EAM can provide important refinements in the assess- 
ment of habitat quality in mapped habitat patches. 
Furthermore, the model structure of the EAM is suffi- 
ciently general to allow the modeling of any ecological 
process or variable that can be represented as an edge 
response, that is, as a continuous function across an 
ecological gradient running orthogonal to a habitat 
edge. We are currently using the same model structure 
to address microclimatic variation and butterfly abun- 
dance across forest structural types in ponderosa pine 
forests on the Colorado Plateau (Meyer and Sisk in 
press) and to model avian nest productivity across het- 
erogeneous forest and riparian ecosystems (J. Battin, 
A. Brand unpublished data). Given the flexibility of 
the EAM in modeling edge-related processes at the 
landscape scale, we believe that the model will provide 
a versatile new tool for assessing animal-habitat relationships.

As discussed above, a significant limitation to wide 
application of the EAM approach is its requirement 
for robust edge-response data for all species of man- 
agement interest. Although the cost and time invest- 
ment for obtaining the relevant empirical data are 
much less than those of intensive demographic approaches
(e.g., the northern spotted owl (Strix occidentalis
caurina) research program, see Forsman et al.
1996a; Noon and McKelvey 1996a), field efforts required
for estimation of edge responses for rare taxa
may, nevertheless, be significant. Several lines of evidence,
however, suggest that efficiencies in data acquisition
may arise. For example, multi-taxon sampling
by the co-location of survey points is recommended. 
(See Fig. 63.1 for edge responses of three of several 
dozen species sampled along the San Pedro River, 
using the methodology described above.) In addition, 
preliminary data suggest that density response func- 
tions are restricted to a small number of possible 
shapes. An active area of research is to determine if 
the shape of edge response curves can be inferred as a 
general function of taxonomy, life history attributes, 
habitat types, and landscape characteristics. 

In its prototype stage, described here, the EAM 
links field and remotely sensed data in a landscape 
model that permits comparison of the effects of landscape
structure on wildlife populations. In this context,
we believe the EAM provides decision makers
with a useful tool for comparing the impacts of alternative
land management plans.


Demographic Monitoring and the Identification
of Transients in Mark-recapture Models 


M. Philip Nott and David F. DeSante 


Methods for monitoring bird populations fall
into two categories: population monitoring 
and demographic monitoring. Population monitor- 
ing includes techniques such as point count surveys 
and area searches that can provide inferences regard- 
ing species richness and abundance. Over time these 
techniques can detect change in population size but 
may not be able to identify whether the cause(s) of 
that change are associated with birth or death
processes if they do not discriminate between breeding 
and nonbreeding individuals. The demographic causes 
of population change may be identified by demo- 
graphic monitoring techniques, including banding and 
nest monitoring, that provide inferences regarding 
productivity and survivorship parameters. Accurate 
estimates of these parameters are essential to the con- 
struction of predictive population models. Constant- 
effort mist netting provides spatially explicit estimates 
of survival rates from mark-recapture modeling of 
banding data (Buckland and Baillie 1987; Rosenberg 
et al. 1999) and indices of productivity obtained from 
the ratio of young to adult birds captured (DeSante 
1992; DeSante et al. 1995; Peach et al. 1996). This 
technique provides both within-year and between-year 
information, allowing modifications to be made to 
mark-recapture models that consider the existence of
transient individuals and provide estimates of demo-
graphic parameters for the resident proportion of the
population.

Pradel et al. (1997) proposed a method for identify- 
ing transients and adjusting survival rate estimates ac- 
cordingly. This method is incorporated into a modifi- 
cation of the Cormack-Jolly-Seber mark-recapture 
model (Cormack 1964; Jolly 1965; Seber 1965) called 
TMSURVIV (Pradel et al. 1997). This model provides 
estimates of the proportion of residents (τ) in addition
to estimates of survival rate (φ) and recapture proba-
bility (p). Although this method allows that a bird
caught in only one year may be a resident, based on 
the probability of between-year recaptures being less 
than unity, it ignores within-year information inherent 
in banding data that are derived from constant-effort
mist netting. Analyzing these data requires a more so-
phisticated mark-recapture model that considers a
length-of-stay criterion for individual birds (Pradel et
al. 1997). This is provided by LOSSURVIV, a recent
modification of TMSURVIV by J. D. Nichols and
J. Hines.

In this chapter, we describe the flexibility of con-
stant-effort mist-netting data and outline methods by
which the accuracy of estimating demographic param- 
eters can be improved. We utilize banding data from 
ten bird species (three temperate-wintering, three 
temperate-tropical, and four tropical-wintering spe- 
cies) captured in the northwestern United States at 
thirty-six constant-effort banding stations operated by
the Institute for Bird Populations as part of the Mon-
itoring Avian Productivity and Survivorship (MAPS)
program (DeSante 1992; DeSante et al. 1995). We 
analyze banding data for the period 1992 to 1998 to 
provide temporal patterns of the resident adult, tran- 
sient adult, and juvenile portions of the population. 
For each species, we compare these trends with those 
obtained from population monitoring data for the 
western states provided by the North American 
Breeding Bird Survey (BBS; Peterjohn et al. 1995). 
We compare estimates of adult survival rates, recap- 
ture probabilities, and proportions of residents ob- 
tained from three different mark-recapture models 
(Pollock et al. 1990; Lebreton et al. 1992). These in-
clude LOSSURVIV, a recent modification of SURVIV
(White 1983, 1986) that considers individuals cate- 
gorized as transients or residents according to a 
length-of-stay criterion. Finally, we explore the rela- 
tionship between survival rate and productivity as a 
function of migratory strategy with respect to ac- 
cepted life-history theory for temperate, temperate/ 
tropical, and tropical migrants. 


Methods 


The Monitoring Avian Productivity and Survivorship 
(MAPS) program collects data from over five hun- 
dred field stations across the North American conti- 
nent (DeSante et al. 1998, 1999). This program 
adopts a “constant-effort mist-netting” protocol to 
provide survivorship estimates and productivity in- 
dices for passerines. Typically, at each station, bird- 
banding teams operate ten mist nets located within 
the central 8 hectares of a 20-hectare study plot for 
six hours following sunrise. Each station is visited on 
one day within sequential ten-day periods through- 
out the breeding season (May to August) up to a 
maximum of ten periods. Normally, six stations con- 
stitute a MAPS location, which represents a monitor- 
ing effort in a national forest, national park, or other 
managed land area. The protocol assumes that cap- 
tures include adults and young from both the vicinity 
of the monitoring station early in the breeding sea- 
son and from the surrounding landscape later in the 
season as breeding activity ceases and the birds begin 
to disperse. 


TABLE 64.1. 


Common names, scientific names, and Bird Banding
Laboratory (BBL) abbreviations for ten bird species
represented by over five hundred individuals in the MAPS 
database for USDA Forest Service Region 6. 


Common name                          Scientific name                            BBL abbr.
Western flycatcher                   Empidonax difficilis and E. occidentalis   WEFL
Winter wren                          Troglodytes troglodytes                    WIWR
Swainson’s thrush                    Catharus ustulatus                         SWTH
Yellow-rumped (Audubon’s) warbler    Dendroica coronata audubonii               AUWA
Townsend’s warbler                   Dendroica townsendi                        TOWA
MacGillivray’s warbler               Oporornis tolmiei                          MGWA
Wilson’s warbler                     Wilsonia pusilla                           WIWA
Song sparrow                         Melospiza melodia                          SOSP
Lincoln's sparrow                    Melospiza lincolnii                        LISP
Dark-eyed (Oregon) junco             Junco hyemalis oregonus                    ORJU


Here, we consider a group of thirty-six stations 
that allow us to monitor demographic parameters in 
forested lands under the stewardship of the USDA 
Forest Service Region 6, which covers the Pacific 
Northwest region of the United States. Specifically, 
the locations included are Mount Baker National 
Forest and Wenatchee National Forest in the state of 
Washington and Willamette National Forest, Siuslaw 
National Forest, Umatilla National Forest, and Fre- 
mont National Forest in the state of Oregon. These 
forests are typically heavily managed and share many 
plant, animal, and bird species. Data were pooled for
ten target species (the ten most frequently captured
species, each represented by over five hundred indi-
viduals) captured at these stations to provide regional
survival rate estimates for adult birds.

We selected banding data for the ten most-captured 
species (Table 64.1) during the seven-year period from 
1992 through 1998 inclusive. Dates of operation vary 
by station dependent upon latitude and elevation. We 
only considered captures made between MAPS peri-
ods 4 and 10 representing the seven ten-day periods
between May 31 and August 8 that were common to
all stations.


Demographic Groups and Indices
of Productivity


Birds that are caught and banded belong to one of 
three demographic groups—residents, transients, and 
juveniles. Early in the breeding season the catch in- 
cludes philopatric individuals returning to breeding 
territories within the boundaries of the banding sta- 
tion, many of which are caught year after year. The re- 
maining portion of the early-season catch is made up 
of transient individuals passing through the station on 
their way to distant territories, or seeking habitat in 
which to establish new territories. In the middle of the 
breeding season the catch consists of resident breeders 
(whose activity space includes a mist net location) and 
unpaired adults (floaters) that may be queuing for 
available territories or passing through in search of 
otherwise unoccupied breeding habitat. Although 
some individuals are only caught in one year, they may 
be caught more than once in that year and could be 
considered as resident birds. Later in the season, as 
young birds fledge and adult territoriality relaxes, the 
catch includes dispersing juveniles and adults from the 
surrounding area. 

For each species, we constructed temporal activity 
patterns by categorizing individuals as resident, tran- 
sient, or juvenile individuals according to their capture 
histories: 


1. Adults Seen Once (ASO). Transient adults include 
those individuals captured once, and once only, 
and those individuals caught in only one year but 
more than once within a period of less than seven 
days. 

2. Between-Year Residents (BYRES). Those individu-
als caught in at least two different years. It is as-
sumed that the probability that any of these individ-
uals is merely passing through the site and is caught
in more than one year is very low.

3. Within-Year Residents (WYRES). Those individu- 
als caught in only one year but caught more than 
once in that year. They are only classified as 
WYRES if the maximum “length-of-stay” between 
captures exceeded six days; otherwise, they were 
assigned to the ASO category. 

4. Known Residents (KNRES). Those individuals 
classified as BYRES or WYRES; we assume these
birds represent the group that are resident at the
monitoring site.

5. Young (YNG). Those individuals identified in the 
database as juvenile birds that may have come 
from the site or from the surrounding landscape. 
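
The capture-history rules listed above can be sketched in code. The following Python fragment is only an illustration under stated assumptions, not the procedure used for the analyses in this chapter: the function names and the `min_stay_days` parameter are hypothetical, and "length of stay" is assumed to mean the number of days between the first and last capture of an adult within its single capture year.

```python
from datetime import date


def classify_adult(capture_dates, min_stay_days=7):
    """Classify one adult bird from the dates on which it was captured.

    Returns "BYRES" if the bird was caught in at least two different years,
    "WYRES" if it was caught in one year only and its captures span at
    least `min_stay_days` days (assumed reading of "exceeded six days"),
    and "ASO" otherwise.
    """
    if not capture_dates:
        raise ValueError("at least one capture date is required")
    years = {d.year for d in capture_dates}
    if len(years) >= 2:
        return "BYRES"
    length_of_stay = (max(capture_dates) - min(capture_dates)).days
    if length_of_stay >= min_stay_days:
        return "WYRES"
    return "ASO"


def is_known_resident(category):
    """KNRES is the union of the BYRES and WYRES categories."""
    return category in ("BYRES", "WYRES")


# Hypothetical capture dates for three adults.
print(classify_adult([date(1994, 6, 5), date(1994, 6, 9)]))   # ASO
print(classify_adult([date(1994, 6, 5), date(1994, 6, 15)]))  # WYRES
print(classify_adult([date(1994, 6, 5), date(1995, 6, 15)]))  # BYRES
```

Young birds are not handled here because they are identified directly from the age recorded in the database rather than from capture histories.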


For each species, we estimated linear temporal 
trends for the annual numbers of all adults, known 
residents, adults seen once, and juveniles. We indexed 
productivity as the proportion of young in the catch 
both annually and as a mean annual index for the 
whole period. We recorded the frequency of captures 
of birds (pooled across years) in each category by ten- 
day period, calculated the abundance of each category 
as a proportion of the total number of captures, and 
plotted the results as a histogram for each species. 
Furthermore, to look at the diurnal patterns of activ- 
ity, we plotted the frequency of captures by hour after 
sunrise. Henceforth, we refer to these histograms as 
“seasonal capture profiles” and “hourly capture pro- 
files,” respectively. 
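
As a rough sketch of the summaries described above, the following Python fragment computes a productivity index, a linear trend, and a capture profile. It assumes that the linear trend is an ordinary least-squares slope of annual counts on year, which the text does not state explicitly; all function names and the example counts are hypothetical.

```python
def productivity_index(n_young, n_adults):
    """Proportion of young in the total catch: Young / (Adults + Young)."""
    total = n_young + n_adults
    if total == 0:
        raise ValueError("no captures")
    return n_young / total


def linear_trend(years, counts):
    """Least-squares slope of annual counts on year (individuals per year)."""
    n = len(years)
    if n != len(counts) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    return sxy / sxx


def capture_profile(counts_by_slot):
    """Convert counts per ten-day period (or per hour) into proportions."""
    total = sum(counts_by_slot.values())
    return {slot: count / total for slot, count in counts_by_slot.items()}


# Hypothetical annual counts of adults for one species, 1992 to 1998.
years = [1992, 1993, 1994, 1995, 1996, 1997, 1998]
adults = [70, 66, 68, 61, 63, 58, 57]
print(round(linear_trend(years, adults), 2))
print(capture_profile({4: 10, 5: 20, 6: 30, 7: 40}))
```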


Comparing Population Trends in MAPS 
with Breeding Bird Survey Data 


The North American Breeding Bird Survey (BBS) pro- 
vides trends of the numbers of birds of all species seen 
and heard at a number of stops along routes distrib- 
uted across North America (Peterjohn et al. 1995). We 
obtained the regional BBS abundance trends (James et 
al. 1996; Link and Sauer 1998) for ten target species 
(Table 64.1) of British Columbia, Washington, Ore- 
gon, and California for the period 1992-1998. For 
each species, we compared these trends with corre- 
sponding trends in the number of adults calculated 
from MAPS data. 


Estimates of Survivorship and Transience 


Birds caught only once may belong to any of the three 
demographic groups. Adults may be transient individ- 
uals, residents that died or left the area by the next 
year, or residents caught in the latest year of banding 
that may be caught in future years. The transient indi- 
viduals cause survivorship estimates to be biased low 
in closed-population models. Pradel et al. (1997) ap-
proached the problem of identifying transients using
an ad hoc approach to produce unbiased estimates of
resident proportions that effectively ignores the first 
year of capture for all individuals. This approach is in- 
corporated into TMSURVIV, which produces esti- 
mates of survivorship and capture probabilities for 
residents, as well as proportions of residents. This ap- 
proach, although unbiased with regard to the recap- 
ture probabilities, does not take into account the im- 
portant “within-year” information that can indicate 
an individual captured in only one year is a resident 
and not a transient. Pradel et al. (1997) suggest that 
the critical parameter in identifying transients is the 
length-of-stay period and that transients can be de- 
tected with a suitable study design in which the inter- 
val between capture sessions exceeds that of the maxi- 
mum length-of-stay of a transient individual. 
Constant-effort banding studies represent a sampling 
design appropriate for detecting transients using a 
length-of-stay method because banding sessions can 
be separated by an interval that exceeds the period of 
time a transient might be expected to stay in the area. 

For each species, we constructed capture histories for 
all adult birds. We obtained time-constant estimates of 
adult survival probability (φ), recapture probability (p),
and proportion of residents (τ) by entering the capture
histories into three different mark-recapture models. 
The first and oldest model, SURVIV (White 1983), as- 
sumes a closed population and does not distinguish be- 
tween transient and resident individuals. The second 
model, TMSURVIV, is a modification of SURVIV that 
provides an unbiased estimate of the resident propor- 
tion based on between-year information (Pradel et al. 
1997). The final model, LOSSURVIV, is a modification 
of TMSURVIV that considers additional within-year in- 
formation. This model requires categorizing capture 
histories as (1) unmarked individuals caught only once 
in the first year of capture, regardless of how many 
times they were caught in subsequent years, and (2) 
marked individuals caught more than once seven or 
more days apart in their first year of capture. We 
processed these capture histories for each species 
through a number of permutations of time-constant
and time-dependent (with respect to φ, p, and τ) sub-
models to ascertain whether the time-constant φpτ sub-
model represented the best (or equivalent) submodel for
estimating demographic parameters. We compared and
contrasted the values of demographic parameters result-
ing from these models.
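
To make the construction of capture histories more concrete, the following Python sketch builds an annual capture history for one adult and flags whether it was recaptured at least seven days apart in its first year of capture. This is an assumed illustration only: it does not reproduce the input format of SURVIV, TMSURVIV, or LOSSURVIV, and the function names and parameters are hypothetical.

```python
from datetime import date


def annual_capture_history(capture_dates, first_year=1992, last_year=1998):
    """Return a string of 1s and 0s, one per year, marking years with a capture."""
    years_seen = {d.year for d in capture_dates}
    return "".join(
        "1" if year in years_seen else "0"
        for year in range(first_year, last_year + 1)
    )


def recaptured_in_first_year(capture_dates, min_days_apart=7):
    """True if, in its first capture year, the bird was caught again
    at least `min_days_apart` days after its first capture."""
    first_year = min(d.year for d in capture_dates)
    in_first_year = sorted(d for d in capture_dates if d.year == first_year)
    return (in_first_year[-1] - in_first_year[0]).days >= min_days_apart


# Hypothetical bird: caught twice in 1993 (19 days apart) and once in 1995.
dates = [date(1993, 6, 1), date(1993, 6, 20), date(1995, 7, 1)]
print(annual_capture_history(dates))    # 0101000
print(recaptured_in_first_year(dates))  # True
```

The flag corresponds to the within-year information that separates the two categories of first-year capture described above.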


The Relationship between 
Survival Rate and Productivity 


DeSante et al. (1998) presented evidence for the exis- 
tence of the trade-off between survival rate and pro- 
ductivity suggested by Martin (1995) and found the 
relationship to be a function of longitude and migra- 
tion strategy. The underlying theory is that the longer 
a species lives the fewer offspring it needs to produce 
to maintain stable population levels. Avian migration 
to climatically more stable tropical wintering areas 
may lead to higher overwintering survival rates rela- 
tive to those of temperate-wintering species. On the 
other hand, temperate-wintering species can exploit 
available breeding habitat sooner than tropical-win- 
tering species and potentially produce more clutches. 
All else being equal, survival rates and productivity in- 
dices should correlate negatively and be a function of 
migration strategy. To explore this relationship, we 
plotted the relationship between survivorship and pro- 
ductivity for the ten species of this study. 


Results 


Seasonal capture profiles (Fig. 64.1) show the number
of known resident individuals (KNRES = BYRES +
WYRES), individual adults seen once only (ASO), and 
individual young birds as a function of sequential vis- 
its (periods) to the stations throughout the breeding 
season. Typically, the peak of KNRES captures occurs 
in period 6 or 7 corresponding to late June or early 
July, with the exception of dark-eyed juncos (Junco 
hyemalis), for which the peak capture period occurs in
period 4 at the beginning of June. Note that the sea- 
sonal capture profiles of KNRES include, for many in- 
dividuals, multiple captures of the same birds. The 
peak of ASO captures relative to the peak of resident 
captures varies across species and occurs in the same 
or a later period, never earlier. The ASO profile for the 
western flycatcher (Pacific-slope flycatcher [Empi-
donax difficilis] and Cordilleran flycatcher [E. occidentalis]) takes a
sudden jump from fifty to eighty individuals in period
7, whereas that for the dark-eyed junco (Junco hye-
malis) decreases gently over the entire season.


Figure 64.1. Seasonal capture profiles of the number of individuals caught in mist nets by ten-day periods be-
tween May 31 and August 8. These data are pooled from thirty-six banding stations across a seven-year pe-
riod (1992-1998). Six species are represented: MacGillivray's warbler (MGWA), Swainson's thrush (SWTH),
western flycatcher (WEFL), song sparrow (SOSP), Lincoln's sparrow (LISP), and dark-eyed (Oregon) junco
(ORJU). Individuals are categorized into three groups, and second-order polynomials are fitted to each set of
histogram bars. These categories are known residents (black bars, solid line), adults seen once only (gray
bars, dashed line), and young birds (white bars, dotted line).


Although the date varies by year, young first appear
in period 6 or 7, and their numbers generally in-
crease to a peak in periods 9 and 10 for all species ex-
cept song sparrow (Melospiza melodia) and Lincoln's
sparrow (Melospiza lincolnii), for which the peak occurs in
periods 9 and 8, respectively.

The hourly capture profiles depicted in Figure 64.2 
also show differences between species but not across 
demographic groups within a species. Generally, the 
proportion of total captures decreases as the day pro- 
gresses. Approximately 45 percent of MacGillivray’s 
warbler (Oporornis tolmiei), Swainson's thrush 
(Catharus ustulatus), and song sparrow captures 
occur in the first two hours (peaking in the second 
hour), followed by a gradual decline in the hourly cap-
ture proportion thereafter. Lincoln's sparrow captures
occur mainly in the first three hours (60 percent of
total) and decline more rapidly thereafter. Hourly cap-
ture profiles for western flycatcher and dark-eyed
junco show a more even distribution except that the
proportion of western flycatcher captures made in the
first hour is very low.


Population Trends in MAPS and
Breeding Bird Survey Data


Six of the ten species show a negative trend in the 
total number of adults caught during the period 
1992-1998 (Table 64.2) but only Townsend's war- 
bler (Dendroica townsendi) shows a significant trend 
(P < 0.05). Generally, these trends are reflected in the
trends reported for the KNRES and ASO portions of
the captures as well as for the annual numbers of young


Figure 64.2. Hourly capture profiles of the proportion of individuals caught in mist nets in each of seven hours
after sunrise. These data are pooled from thirty-six banding stations visited annually between May 31 and August
8 across a seven-year period (1992-1998). Six species are represented: MacGillivray’s warbler (MGWA), Swain-
son's thrush (SWTH), western flycatcher (WEFL), song sparrow (SOSP), Lincoln's sparrow (LISP), and dark-eyed
(Oregon) junco (ORJU). Individuals are categorized into four groups: all adults (black bars), adults seen once only
(dark gray bars), known residents (light gray bars), and young birds (white bars).


captured. The increases in the total number of adult 
song sparrows and winter wrens (Troglodytes 
troglodytes) appear to be associated with significant
increases in the number of adults only seen once. 
Similarly, the decline in western flycatcher adults is 
associated with a significant decline in the numbers 
of adults seen once. Conversely, the declines in 
Townsend’s warbler and dark-eyed junco adults are 
associated with significant declines in the numbers of 
known residents. For the dark-eyed junco population 
at least, the overall decline may be driven by a signif- 
icant decline in the numbers of young. 

Breeding Bird Survey trends for these species in the 
western coastal states region agree with the MAPS 
adult trends in nine out of ten cases. The MAPS trend 
for adult Swainson’s thrush shows an increase and the 
BBS trend a decline.


Competing Mark-recapture Models 


For each species, comparisons of the survivorship 
probability, capture probability, and resident propor- 
tion from three competing mark-recapture models— 
SURVIV, TMSURVIV, and LOSSURVIV—are shown 
in Table 64.3. In all cases, the time-independent tran- 
sient model represents the best (or equivalent) model 
reported by TMSURVIV or LOSSURVIV based on 
the values of Akaike Information Criteria (Akaike 
1981; Burnham and Anderson 1992) associated with 
each combination of time-dependent and time-inde-
pendent φ, p, and τ models. The values of both φ and
p parameter estimates are significantly greater for
both TMSURVIV and LOSSURVIV than for SURVIV
(P < 0.01, two-tailed t-test), whereas exactly half are
greater and half are smaller when comparing


TABLE 64.2.

Total number and short-term trends (1992-1998) in all adults (Adult), known residents (KNRES), adults seen once
(ASO), and young (Young) for ten bird species well represented in the MAPS data covering USDA Forest Service
Region 6.

Codeᵃ   Adult   Trend   KNRES   Trend     ASO   Trend   Young   Trend     BBSᵇ     PIᶜ
WEFL      470    -4.0      92    -2.2     378    -1.8ᵈ     88     0.3   -2.79ᵈ   0.14
WIWR      534     1.2     138    -2.8     396     4.1ᵈ    266     2.5    5.57ᵈ   0.31
SWTH     2213     5.6     947     0.5    1266     5.2     342    -3.2   -2.25ᵈ   0.09
AUWA      379    -1.2      55    -2.0     324     0.8     186    -4.0   -1.08    0.29
TOWA      447    -8.5ᵈ     68    -7.1ᵈ    379    -1.4     469    -6.6   -2.26    0.45
MGWA     1158    -4.8     407    -5.6     751    -0.8     720    -5.9   -0.9     0.32
WIWA      776    -3.5     199    -2.9     557    -0.6     312    -5.7   -1.46    0.28
SOSP      484     5.0     219     0.5     265     4.5ᵈ    492     2.5    1.31    0.45
LISP      550     0.9     279    -0.7     271     1.6     267     0.0    2.37ᵈ   0.25
ORJU      889    -2.1     277    -3.7ᵈ    612     1.6     939   -18.3ᵈ  -1.22ᵈ   0.45ᵉ

ᵃSpecies abbreviations: WEFL = Empidonax difficilis and E. occidentalis; WIWR = Troglodytes troglodytes; SWTH = Catharus ustulatus;
AUWA = Dendroica coronata audubonii; TOWA = Dendroica townsendi; MGWA = Oporornis tolmiei; WIWA = Wilsonia pusilla;
SOSP = Melospiza melodia; LISP = Melospiza lincolnii; and ORJU = Junco hyemalis oregonus.
ᵇNorth American Breeding Bird Survey (BBS) regional population trends for British Columbia, Canada, and Washington, Oregon, and
California, USA.
ᶜThe mean annual proportion of young (PI) in the total catch (pooled across all stations) expressed as (Young/[Adults + Young]).
ᵈStatistically significant (P < 0.05).
ᵉSignificantly decreasing productivity (P = 0.01).


TMSURVIV with LOSSURVIV for φ and p (non-
significant). The φ and p increases that result from
using LOSSURVIV over TMSURVIV are associated
with decreases in the estimates of the resident pro-
portion (τ). Conversely, the φ and p decreases that re-
sult from using LOSSURVIV rather than TMSURVIV
are associated with increases in τ for only three of five
species. The percent change in the precision of the es-
timate of survivorship (φ) results from comparing the
coefficient of variation (standard error of the estimate
of φ / estimate of φ) from LOSSURVIV with that from
TMSURVIV. Precision increased for all species, with
values ranging from 1 to 29.3 percent and with a
mean improvement of about 16 percent (P < 0.01,
two-tailed t-test), indicating that estimates of survival 
rate produced by LOSSURVIV are generally more pre- 
cise than those produced by TMSURVIV. 
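
The precision comparison can be sketched as follows. The exact formula behind the percent change is not given in the text, so this Python fragment assumes it is the relative reduction in the coefficient of variation, taking TMSURVIV as the reference; the function names are hypothetical.

```python
def coefficient_of_variation(estimate, standard_error):
    """Coefficient of variation as a percentage: 100 * SE / estimate."""
    return 100.0 * standard_error / estimate


def percent_precision_gain(cv_reference, cv_new):
    """Relative reduction in CV (percent); positive means cv_new is more precise."""
    return 100.0 * (cv_reference - cv_new) / cv_reference


# CV of survivorship for Wilson's warbler as printed in Table 64.3:
# 11.3 percent from TMSURVIV and 9.0 percent from LOSSURVIV.
print(round(percent_precision_gain(11.3, 9.0), 1))  # 20.4
```

Because the tabulated coefficients of variation are rounded to one decimal place, values recomputed this way for other species may differ slightly from those printed in the table.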

A plot of the mean annual productivity indices
(proportion of young in the catch) derived from Table
64.2, against the survival estimates from LOSSURVIV in
Table 64.3, is shown in Figure 64.3. A regression line
reveals a negative relationship in which a high survival
rate is associated with a low productivity index and a
low survival rate is associated with a high productivity
index. Importantly, survival rates of temperate-winter-
ing species tend to be lower than those associated with
species with intermediate (temperate/tropical-wintering) strate-
gies. In turn, survival rates for these species are lower
than those associated with tropical-wintering species.
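
The direction of this relationship can be checked with a simple least-squares fit. The Python sketch below is an illustration, not the analysis behind Figure 64.3: the helper name is hypothetical, no significance test is performed, and the input values were transcribed by hand from Tables 64.2 and 64.3, so they should be rechecked against the printed tables before any reuse.

```python
from math import sqrt


def least_squares(x, y):
    """Return (slope, intercept, Pearson r) for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / sqrt(sxx * syy)


# Assumed transcription: LOSSURVIV survival estimates (Table 64.3) and
# mean annual productivity indices (Table 64.2) for the ten species.
SURVIVAL = {"WEFL": 0.561, "WIWR": 0.314, "SWTH": 0.577, "AUWA": 0.549,
            "TOWA": 0.390, "MGWA": 0.495, "WIWA": 0.445, "SOSP": 0.436,
            "LISP": 0.439, "ORJU": 0.433}
PRODUCTIVITY = {"WEFL": 0.14, "WIWR": 0.31, "SWTH": 0.09, "AUWA": 0.29,
                "TOWA": 0.45, "MGWA": 0.32, "WIWA": 0.28, "SOSP": 0.45,
                "LISP": 0.25, "ORJU": 0.45}

species = sorted(SURVIVAL)
slope, intercept, r = least_squares(
    [SURVIVAL[s] for s in species],
    [PRODUCTIVITY[s] for s in species],
)
print(slope < 0, r < 0)  # both True: productivity falls as survival rises
```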


Discussion 


Analysis of constant-effort mist-netting data can de- 
tect annual changes in population size and structure as 
well as seasonal or diurnal patterns in the capture 
rates of resident adult, transient adult, and juvenile 
portions of avian populations. This information can 
be used to increase the accuracy and precision of de- 
mographic parameter estimates. Capture histories de- 
rived from these data show interspecific differences in 
temporal patterns of activity. For some species, the 
probability of capture peaks during the first few hours 
of the morning, whereas for others it remains rela- 
tively constant over a larger part of the morning. 
Similarly, the probabilities of capturing young and 
adults vary by species throughout the season. This


TABLE 64.3.


Comparison of estimates of survivorship probability (φ), capture probability (p), and resident proportion (τ) from SURVIV,
TMSURVIV, and LOSSURVIV for ten bird speciesᵃ utilizing MAPS data for the period 1992-1998 in USFS Region 6.


                    SURVIV                             TMSURVIV                                   LOSSURVIV
Speciesᵃ       φ  CV(φ)      p  CV(p)        φ  CV(φ)      p  CV(p)      τ  CV(τ)        φ  CV(φ)      p  CV(p)      τ  CV(τ)   ΔCV(φ)
WEFL       0.505   12.7  0.150   24.0    0.553   13.0  0.215   31.2  0.592   34.8    0.561   11.2  0.232   24.1  0.461   25.8     13.7
WIWR       0.233   18.5  0.460   22.8    0.364   19.8  0.648   16.8  0.399   29.3    0.314   16.2  0.593   17.9  0.430   24.4     17.9
SWTH       0.494    2.6  0.533    3.9    0.585    3.2  0.601    3.7  0.660    5.6    0.577    2.6  0.597    3.5  0.546    5.9     20.0
AUWA       0.481   18.5  0.111   34.2    0.582   17.4  0.251   38.2  0.320   44.4    0.549   17.1  0.194   35.1  0.403   37.7      1.0
TOWA       0.349   16.9  0.236   25.8    0.427   17.1  0.370   27.6  0.475   34.9    0.390   16.7  0.315   28.3  0.609   31.9      2.5
MGWA       0.398   10.1  0.523    7.3    0.525    5.9  0.633    6.0  0.533    9.9    0.495    4.8  0.612    6.0  0.487    9.9     17.9
WIWA       0.359   10.0  0.343   15.2    0.425   11.3  0.442   15.8  0.607   21.4    0.445    9.0  0.467   12.8  0.423   16.8     20.4
SOSP       0.371    8.4  0.604   10.6    0.394   11.7  0.626   11.0  0.884   17.9    0.436    8.3  0.658    9.3  0.584   15.6     29.3
LISP       0.418    6.0  0.654    7.5    0.427    8.9  0.662    8.0  0.957   13.4    0.439    6.8  0.670    7.3  0.857   13.9     23.2
ORJU       0.413    7.5  0.345   11.9    0.415    9.6  0.347   15.9  0.989   19.3    0.433    8.1  0.373   12.6  0.832   14.5     16.1
Mean       0.402         0.396           0.470         0.480         0.642           0.464         0.471         0.563            16.2


Note: In all cases the time-independent transient model represents the best (or equivalent) model reported by TMSURVIV or LOSSURVIV.
Coefficients of variation (standard error/estimate) are calculated for each parameter and shown as percentages. The
percent change in the precision of the estimate of survivorship (φ) results from comparing the coefficients of variation from LOSSURVIV with those
from TMSURVIV. A positive value indicates that the estimate of survival rate from LOSSURVIV is more precise than that from TMSURVIV.

ᵃSpecies abbreviations: WEFL = Empidonax difficilis and E. occidentalis; WIWR = Troglodytes troglodytes; SWTH = Catharus ustulatus;
AUWA = Dendroica coronata audubonii; TOWA = Dendroica townsendi; MGWA = Oporornis tolmiei; WIWA = Wilsonia pusilla;
SOSP = Melospiza melodia; LISP = Melospiza lincolnii; and ORJU = Junco hyemalis oregonus.


strongly suggests that foraging strategies vary by 
species. Obviously, when indexing productivity by the 
proportion of young in the total catch, the seasonal 
timing of missing effort can bias the results. If effort is 
missed early in the season, the number of adult cap- 
tures may be underestimated and bias productivity 
high. Conversely, if effort is missed late in the season, 
the number of young may be underestimated and bias 
productivity low. As an illustration, assume that, for a given species,
one hundred adults and forty young should have been
captured, that 20 percent of the adults are normally cap-
tured in the first banding period, and that 40 percent of the
young are normally captured in period 10. If half of
the effort in the first period were missed, then the pro-
ductivity index (proportion of young in the catch)
would be biased about 8 percent higher than expected. If
half of the effort in period 10 were missed, the pro-
ductivity index would be about 15 percent lower than ex-
pected. Because adult and young captures are a func-
tion of both the time of day and the banding period,
species-specific temporal patterns of capture probabil-
ity must be used to adjust productivity indices when
banding effort is missing.
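
The arithmetic of this example can be reproduced with a short Python sketch. It rests on assumptions that the text leaves implicit: captures are lost in direct proportion to missed effort, no young are caught in the first period, and no adult captures are affected by effort missed in period 10. The function names are hypothetical. Under these assumptions, missing half of the first period raises the index by roughly 8 percent, and missing half of period 10 lowers it by roughly 15 percent; missing the entire first period, rather than half of it, would raise the index by about 17 percent.

```python
def productivity_index(adults, young):
    """Proportion of young in the catch."""
    return young / (adults + young)


def index_with_missed_effort(adults, young, adult_share, young_share, fraction_missed):
    """Productivity index after losing part of the effort in one period.

    adult_share and young_share are the shares of each group's captures
    normally made in the affected period; fraction_missed is the share of
    that period's effort that was not run.
    """
    adults_caught = adults * (1.0 - adult_share * fraction_missed)
    young_caught = young * (1.0 - young_share * fraction_missed)
    return productivity_index(adults_caught, young_caught)


def percent_bias(observed, expected):
    return 100.0 * (observed - expected) / expected


expected = productivity_index(100, 40)
early_half = index_with_missed_effort(100, 40, 0.20, 0.0, 0.5)
late_half = index_with_missed_effort(100, 40, 0.0, 0.40, 0.5)
early_all = index_with_missed_effort(100, 40, 0.20, 0.0, 1.0)

print(round(percent_bias(early_half, expected), 1))  # 7.7
print(round(percent_bias(late_half, expected), 1))   # -15.2
print(round(percent_bias(early_all, expected), 1))   # 16.7
```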

In fact, most mist-netting effort is missed early or 
late in the season due to unfavorable weather, is 
missed early in the day due to logistic or weather 
problems, or is missed late in the day due to high am- 
bient temperatures forcing nets to be closed (P. Nott 
unpublished data). Because interspecific diurnal activ- 
ity patterns vary, effort missed in a particular hour 
may differentially underestimate numbers of captures 
among species. It is also likely that these temporal ac- 
tivity patterns vary geographically and with patterns 
of environmental conditions (e.g., temperature and 
humidity). We propose that to provide comparisons of 
annual indices of productivity over many years, cor-
rections for missing effort should be region-specific
and based on the expected proportion of the catch by
both hour and banding period.

The first step in this process involves constructing 
matrices (hours by periods) expressing the expected 
proportion of resident, transient, and young captures 
in each time slot for each species. Young birds, excepting


Figure 64.3. Relationship between mean annual productivity
and survival derived from banding data for ten species of
passerines breeding in the Pacific Northwest region of the
United States. Productivity is expressed as the annual mean of
the proportion of young in the total catch from banding data
pooled across thirty-six stations. Survival rates are estimated
from a modified Cormack-Jolly-Seber mark-recapture model that
considers transient individuals (see text). Three temperate-win-
tering species (upward pointing triangles), three temperate/
tropical-wintering species (downward pointing triangles) and
four tropical-wintering species (diamonds) are shown. Linear re-
gression shows a significant negative relationship (P < 0.05).


those subsequently recruited into the population,
are generally caught only once and therefore generally 
require only a simple proportional correction as do the 
number of adults seen once. The relationship between 
missing effort and changes in the resident proportion 
of the population is more complex. Many between- 
year captures are also caught multiple times within a 
year and so would remain classified as residents until 
considerably more effort is missed and they become 
classified as adults seen once or even transients. Con- 
versely, as effort increases, a portion of the adults seen 
once may be recaptured in another period and reclassi- 
fied as within-year residents. The proposed method to 
obtain species-specific correction rates for each of the 
time slots and demographic groups is to apply a simple 
Monte Carlo method. Small increments of effort may 
be removed randomly (and repeatedly) from existing 
data, pooled across a number of stations, by period 
and hour. This process facilitates the formulation of 
time-specific (diurnal and seasonal) rate equations ex-
pressed as the proportion of individuals lost per unit of
missing effort. These relationships can then be used to
correct for missed effort by adjusting the numbers of 
individuals caught at individual banding stations. 
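
A minimal sketch of such a Monte Carlo procedure is given below in Python. It is an assumed simplification rather than the proposed method itself: effort is removed as whole sampling slots chosen at random instead of by period and hour, only adults are reclassified, and the record layout `(bird_id, capture_date, slot)`, the slot tuples, and all function names are hypothetical.

```python
import random
from collections import defaultdict
from datetime import date


def classify(dates, min_stay_days=7):
    """Return "KNRES" for between-year or within-year residents, else "ASO"."""
    if len({d.year for d in dates}) >= 2:
        return "KNRES"
    if (max(dates) - min(dates)).days >= min_stay_days:
        return "KNRES"
    return "ASO"


def count_groups(captures):
    """Count individuals per group from (bird_id, capture_date, slot) records."""
    by_bird = defaultdict(list)
    for bird_id, when, _slot in captures:
        by_bird[bird_id].append(when)
    counts = {"KNRES": 0, "ASO": 0}
    for dates in by_bird.values():
        counts[classify(dates)] += 1
    return counts


def mean_change_with_missed_effort(captures, fraction_missed, n_reps=200, seed=0):
    """Mean proportional change in each group when a random fraction of
    sampling slots is removed; negative values mean individuals were lost."""
    rng = random.Random(seed)
    slots = sorted({slot for _bird, _when, slot in captures})
    n_drop = round(fraction_missed * len(slots))
    baseline = count_groups(captures)
    totals = {group: 0.0 for group in baseline}
    for _ in range(n_reps):
        dropped = set(rng.sample(slots, n_drop))
        kept = [c for c in captures if c[2] not in dropped]
        for group, n in count_groups(kept).items():
            totals[group] += n
    return {
        group: (totals[group] / n_reps - baseline[group]) / baseline[group]
        if baseline[group] else 0.0
        for group in baseline
    }


# Hypothetical records; each slot is (station, year, period, hour).
captures = [
    ("A", date(1994, 6, 1), ("S1", 1994, 4, 1)),
    ("A", date(1994, 6, 21), ("S1", 1994, 6, 1)),
    ("B", date(1994, 6, 1), ("S1", 1994, 4, 2)),
]
print(mean_change_with_missed_effort(captures, 0.34))
```

Repeating the calculation over a range of `fraction_missed` values gives the kind of loss-per-unit-effort relationship described above. The adults-seen-once group can grow as well as shrink, because residents that lose a recapture are reclassified into it.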

Although MAPS adult population trends generally 
agree with BBS trends, it is important to note that the 
two protocols are very different. BBS point count data
are collected from 50-kilometer-long roadside routes
representative of a number of different habitat types
and environmental conditions. As such, BBS provides
a good indicator of overall numbers of birds of many
species. On the other hand, MAPS data are collected
from spatially restricted forest interior or off-road for-
est edge plots and, because they distinguish between
adults and young, can provide indices of productivity, 
estimates of survivorship, or estimates of the propor- 
tion of transient individuals in the numbers of adults 
detected. In this way, constant-effort bird banding can 
identify the proximal causes of population changes. In 
the case of the Oregon junco (J. hyemalis oregonus), it 
seems likely from banding data that the number of
young produced in a year is declining rapidly. For
Swainson's thrush, the numbers of detections in BBS 
data are increasing as the numbers of adults seen once 
are increasing, but the numbers of resident adults cap- 
tured show no temporal trend. 

Perhaps, in cases where there is a great disparity be- 
tween trends of adults seen once and resident adults, 
the source-sink dynamics of those populations are 
changing. Fortunately, morphological data collected 
from captured individuals allow banders to identify 
the age (Pyle 1997) and breeding status (identified by 
brood patch and cloacal protuberance) of captured 
birds. It is possible that further analysis of the banding 
data may reveal stations at which, for one or more 
species, the age of recruited breeding birds has 
changed over time. In turn, this might suggest a shift 
from source to sink population or vice versa. 

Obviously, open-population mark-recapture mod- 
els (TMSURVIV and LOSSURVIV) that estimate de- 
mographic parameters based on within-year and/or 
between-year information provide higher values for 
survival rates and recapture probabilities than does 
the closed-population model SURVIV. Although the 
net overall difference in the estimates of survival 
rate produced by TMSURVIV and LOSSURVIV is 
minimal (about 1 percent per species) and statistically
insignificant, LOSSURVIV provides a significant improvement
in the precision of those estimates (about
16 percent). LOSSURVIV also conserves more of the 
information in the capture histories, which may be re- 
sponsible for the improvement in precision surround- 
ing survival rate estimates. 

Across species, we detected a negative correlation 
between survival rates and productivity indices, which 
adheres to generally accepted life history theory (e.g., 
Martin 1995), whereby species with higher survival 
rates (e.g., long-distance migrants) produce fewer 
young than do species with lower survival rates (e.g., 
short-distance migrants). Within species, however, an 
analysis of banding data collected across a range of 
latitudes (and elevations) will facilitate testing of alter- 
nate life-history theories. One such theory, the “time- 
allocation hypothesis” (Greenberg 1980) predicts that 
within species long-distance migrants survive better at 
higher latitudes (or elevations) than at lower latitudes 
(or elevations) because they have compensated for 
having less time to devote to reproduction by evolving 
to live longer. 

For several target species, genetic analyses (Mila 
2000) and isotope analyses (J. F. Kelly pers. comm.) of 
feather and blood samples taken from both North 
American populations in the breeding season (MAPS 
program) and from populations wintering in North 
America, Mexico, and Central America make it possi- 
ble to identify discrete breeding and wintering popula- 
tions of individual species (e.g., MacGillivray's war- 
bler and Wilson's warbler). Then for those species that 
exhibit spatial demographic structure, such as that 
caused by a leap-frog migration strategy (e.g., Bell
1997), it may be possible to quantify the effect of
environmental stressors (e.g., weather or land use
changes) on various stage(s) of species’ life cycles and 
their relative contributions to trends in population 
sizes. Further research will therefore explore the rela- 
tionships among regional variations in age structure, 
morphometrics, survivorship, and productivity toward 
a better understanding of source-sink population dy- 
namics, phylogeography, and life-history theory. 

Predicting Species Occurrences: Progress,
Problems, and Prospects


John A. Wiens


In one way or another, predicting the occurrence of
species in space and time comes down to dealing 
with habitats. The issues that we face in predicting oc- 
currences therefore are issues of habitat—how we de- 
fine it; how we measure, map, and model it; how we 
analyze it; what it means to organisms; and, ultimately, 
how we can use knowledge about habitat to manage 
natural resources in a sustainable and balanced way. 

These are not novel insights. Awareness of “habitat”
and its importance has been with us for a long
time. It was the foundation of the observations of 
early naturalists, such as Gilbert White, Henry David 
Thoreau, or Alexander von Humboldt, and it became 
formalized in the writings of Charles Darwin, Alfred
Russel Wallace, John Muir, and Aldo Leopold. It continues
as a thread that runs through the thinking and
work of all modern ecologists. For some, however, it is 
buried deep in their subconscious. This is especially 
true of some theoretical ecologists, although certainly 
not all. After all, Robert MacArthur, perhaps the most 
influential theoretical ecologist of the twentieth century,
really knew his birds. And “habitat” has become
the foundation on which management of species and 
communities is often based. 

All of these approaches, from the observations of 
early naturalists to the applications of modern land 
managers, are founded on the premise that there is a 
close relationship between the occurrence and abundance
of species and characteristics of their habitats,
that we can predict what will happen to species by 
knowing their habitat relationships. There have been 
times when we thought we had it right, that we could 
actually predict species occurrences with a high degree 
of certainty and accuracy. During the 1950s and 1960s, 
for example, niche theory was thought to provide a rea- 
sonable abstraction of the critical features of habitat. 
This approach gave way in the 1970s and 1980s to 
the construction of models based on multivariate 
statistics—principal components analysis (PCA), discriminant
function analysis (DFA), and a variety of
others—what Dean Stauffer (Chapter 3) has referred to
as the “multivariate muddle” period, although it might
more appropriately be termed “multivariate madness.”
Yet, every time, as we learned more ecology, we realized 
that critical components were missing from these ap- 
proaches. I know. I began my career basing my analyses 
of habitat relationships on niche theory (Wiens 1969) 
and then went through a multivariate phase—thanks to 
John Rotenberry (Wiens and Rotenberry 1981b). Each 
time I thought I had uncovered the “true” habitat relationships
only to realize that my “other things being
equal” assumption contained too much interesting ecology
to ignore. Now I’m in a spatially explicit landscape
phase; we'll see where that leads. 

So thinking about and assessing “habitat” has 
evolved. Perhaps the surest way to see where progress
has been made and problems remain—for the
detection of new, unforeseen problems is an inevitable
consequence of progress—is to compare the present 
volume with the Wildlife 2000 collection published 
sixteen years ago (Verner et al. 1986b). My intent here 
is to offer a collection of thoughts, perspectives, and 
observations that have been prompted by these books, 
by hallway discussions at the Snowbird conference, 
and by my own tortuous pathway toward understand- 
ing “habitat.” My presentation will be more in the 
style of an extended essay than a comprehensive synthesis
or review. Hence, I will refer to the published
literature only sparingly. 


Progress 


Dean Stauffer (Chapter 3) has provided an enlighten- 
ing and entertaining historical perspective on how we 
got where we are. Clearly, approaches to wildlife- 
habitat evaluation have changed a lot, and we’d like 
to think that most of the change represents progress. 
Looking over Wildlife 2000, one finds considerable at- 
tention devoted to describing and (occasionally) test- 
ing habitat models, many of them based on a habitat 
suitability index (HSI) protocol. Statistical analyses of 
habitat relationships highlighted regression ap- 
proaches and multivariate procedures, and although 
the emphasis was on linear relationships, the HSI for- 
mulations often included strong response thresholds. 
Few of the papers in that volume considered temporal 
or spatial variation explicitly (although habitat frag- 
mentation was a major topic of concern), and even 
fewer mentioned issues of scale. 

Have things changed all that much? In their intro- 
ductory remarks to Wildlife 2000, Jerry Verner, Mike 
Morrison, and C. J. Ralph observed that “What we 
need now are some appropriate ways to organize our 
information and get it into computer files, and suitable 
methods for analyzing the information in those files.” 
In that respect, the past sixteen years have certainly 
seen remarkable progress. Progress is evident in other 
areas as well. I’ll comment on this progress in three
areas: tools, concepts, and sociology. 


Tools 


The most obvious evidence of progress in dealing with 
wildlife-habitat relationships is in the size and sophistication
of the toolbox. One of the distinguishing features
of many of the chapters in this volume, for example,
is the size of the data sets analyzed. Dealing
with thousands of sampling points, or hundreds of 
thousands of pixels, would have been unimaginable a 
decade ago. Moore's Law, named after Intel founder 
Gordon Moore, suggests that transistor density, and 
hence microprocessor speed, doubles every eighteen
months. There are theoretical limits to this function, 
but we don't seem near them yet. The crunching ca- 
pacity (CC) of computers continues to increase expo- 
nentially, and progress is being made in dealing with 
the problem of mismatches in the scale of data related 
to different components of an ecosystem or of a 
model—variously known as the transmutation prob- 
lem or the Modifiable Areal Unit Problem (Henebry 
and Merchant, Chapter 23). In many cases, the wealth 
of data derives from our capacity to gather informa- 
tion by remote sensing and analyze the data using ge- 
ographic information systems (GIS). The development 
of GIS has had a tremendous influence on assessing 
species’ occurrences, enabling us to map (with appar- 
ent precision) the spatial pattern and distribution of 
multiple habitat and environmental features. Virtually 
all of the empirical chapters in this book make use of 
GIS in one way or another. Gap analysis, and its ap- 
plication to mapping biodiversity, land use, and stew- 
ardship, would be impossible without GIS. Beyond 
these obvious uses, GIS has led to closer examination 
of classification and categorization procedures and to 
the development of numerous metrics to describe 
landscape patterns. It is difficult to overestimate the
importance of this tool to what we do.

Progress in the area of data analysis, while not so 
obvious, has been just as remarkable. Advances in re- 
gression modeling, such as generalized linear modeling 
(GLM), generalized additive modeling (GAM), and 
Poisson regression have increased analytical power, 
and regression has been coupled with hierarchical 
classification approaches in classification and regres- 
sion tree (CART) analyses. (I suppose that a useful 
index of progress might be the proliferation of 
acronyms in a science.) New approaches developed in 
other disciplines, such as genetic algorithm for rule-set 
prediction (GARP) modeling or neural network mod- 
eling, are just beginning to be applied to habitat 
evaluation, but their potential appears to be great.
Progress has also been evident in how we deal with 
space. Ecologists, modelers, and managers have long 
realized that natural systems vary spatially and that 
this variation affects the dynamics of natural systems. 
It can also confound conventional statistical analysis. 
A rapidly growing suite of spatial statistics, such as 
autologistic regression, semivariance analysis, kriging, 
and correlograms (see Fortin [1999] or Dale [1999] 
for useful and understandable reviews), is providing 
the tools necessary to describe spatial patterns quanti- 
tatively as well as to assess the magnitude of spatial 
autocorrelation and thereby evaluate its potential ef- 
fects on more conventional, nonspatial approaches. 
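
To make the idea of "assessing the magnitude of spatial autocorrelation" concrete, the sketch below computes Moran's I, a standard autocorrelation index of the kind typically plotted in a correlogram. Moran's I is not named in the text and is used here only as a simple stand-in; the transect values and the adjacent-plot weighting scheme are invented for illustration.

```python
def adjacency_weights(n):
    """Weight matrix for n plots along a transect: 1.0 for
    immediate neighbors, 0.0 otherwise (an assumed scheme)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]

def morans_i(values, weights):
    """Moran's I = (n / W) * sum_ij w_ij d_i d_j / sum_i d_i^2,
    where d_i is the deviation of plot i from the overall mean."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / w_total) * cross / sum(d * d for d in dev)

# Hypothetical counts on eight plots along a transect.
clustered = [1, 1, 1, 1, 5, 5, 5, 5]      # similar values sit together
alternating = [1, 5, 1, 5, 1, 5, 1, 5]    # neighbors always differ

w = adjacency_weights(8)
print("clustered:  ", round(morans_i(clustered, w), 3))
print("alternating:", round(morans_i(alternating, w), 3))
```

Positive values indicate that neighboring plots resemble one another more than expected by chance, which is the situation in which conventional tests that assume independent samples become unreliable.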

In the era of Wildlife 2000, statistics and modeling 
(primarily simulation modeling) were largely separate 
activities: one analyzed the data and then used the re- 
sults to specify parameter values (usually only means) 
as inputs to simulation models. Increasingly, statistical 
analysis blends into modeling, and both are being 
linked to spatial and cartographic analyses via GIS. In 
addition, rather than seeking a single wildlife-habitat 
model that works and labeling it “best,” modelers are
now developing multiple models to address a particu- 
lar situation and are using statistical tools such as 
Akaike's information criterion (AIC) to select among 
alternatives. Considerable attention is being given to 
evaluating model performance, both in terms of the 
logics of the model structure and functions, and the 
accuracy of the predictions about species occurrence. 
Perhaps the greatest change in modeling, however, is 
the level of detail and complexity that can be included 
in models as a consequence of increased computer ca- 
pacity and speed. This has led to the development of 
models in which the dynamics and interrelationships 
among numerous locations (spatially explicit models, 
or SEMs) or individuals (individual-based models, or 
IBMs) can be tracked in mindboggling detail. In- 
evitably, these approaches have given rise to integrated 
spatially explicit individual-based models (SEIBMs). 
Or, in a delightful play on acronyms, the individual 
cowbird behavior model (ICBM; Shapiro et al., Chap- 
ter 49). We are reaching the point where we can incor- 
porate additional levels of complexity into models as 
fast as we can recognize them. The potential development
of technologies such as optical or molecular
computing or self-organizing computational networks
suggests that modeling complexity may be limited 
more by our ingenuity than by our technology—as, 
perhaps, it has always been. 


Concepts 


Ultimately, our ability to use tools intelligently de- 
pends largely on how we think about Nature or, more 
proximately, wildlife-habitat relationships and species 
occurrences. Theory and concepts provide the intellec- 
tual context for our efforts. As in any discipline, the- 
ory and concepts in ecology develop in fits and starts, 
largely because the advances that foster breakthroughs 
tend then to establish a tenacious hold that may slow 
further development—the continued dominance of the 
assumption-laden logistic model in population biology 
is a good example. Kingsland (1985) provides a splen- 
did review of this process, while Peters’ (1991) treat- 
ment is more caustic. Nonetheless, ecological thinking 
has advanced considerably over the past sixteen years, 
largely due to a shedding (at least in part) of concepts 
and theories founded on assumptions of temporal 
equilibrium and spatial homogeneity in favor of a 
recognition of the importance of variation in time and 
space. These changes have affected both how we think 
about organisms and populations and how we view 
the environments they occupy. On the biological side 
of the ledger, concepts such as keystone species, um- 
brella species, functional types, and ecological redun- 
dancy have influenced how we regard the role or im- 
portance of particular species in an ecological context. 
Metapopulation theory and population viability
analysis (PVA) have opened new avenues for assessing 
the spatial, or demographic, structure of populations 
and its consequences. And, of course, “biodiversity” 
(which traces its roots directly to the focus on species 
diversity that dominated community ecology during 
the 1960s, or thinking about species-abundance distri- 
butions that developed decades earlier) has spawned a 
plethora of papers, books, web sites, and videos, even 
though there is not a clear consensus about what the 
word really means. 

Then there is the matter of how we deal with envi- 
ronmental variation. More than a generation ago, 
Robert MacArthur (1972a,b) advised us to study ho- 
mogeneous areas of habitat, where ecological patterns 
should emerge most clearly. For a decade (or two),
ecologists followed this advice, ignoring the hetero- 
geneity that confronted them both within and among 
habitats. Heterogeneity is now receiving much closer 
attention, both in terms of description and measure- 
ment and of its consequences for population dynamics 
and community structure. We are beginning to con- 
sider the linkages that exist between temporal varia- 
tion and spatial variation and how they can affect 
both how we model and how we manage habitat for 
wildlife. But perhaps the greatest shift has occurred in 
our thinking about scale (extent and grain), from ig- 
noring it almost completely (as in Wildlife 2000) to re- 
alizing that everything we observe and model is sensi- 
tive to scale. (The collection assembled by Peterson 
and Parker [1998b] provides an excellent introduction 
to the manifold ways of viewing scale and its conse- 
quences.) The emergence of landscape ecology has 
done much to focus attention directly on issues of het- 
erogeneity and scale. 


Sociology 


There is more to science and its applications than the- 
ories, concepts, models, and data. People are involved. 
Without going deeply into the demographic and cul- 
tural changes that have affected science and manage- 
ment in the last sixteen years, let me call attention to 
two significant changes, one of which represents real 
progress, the other, perhaps not. 

If one takes the contributions and contributors to 
Wildlife 2000 as reasonable indicators of the field of 
wildlife-habitat evaluation and management in the 
mid-1980s, it is apparent that women were not well 
represented. By my count, only 9 percent of the au- 
thors of chapters in that volume were women. In the 
present volume, 28 percent of the authors are women, 
a three-fold increase. Although this proportion does 
not yet come close to matching the representation of 
women in graduate programs in wildlife management 
and ecology, it nonetheless indicates substantial 
progress. Less progress has been made in involving 
ethnic minorities in the task of evaluating and apply- 
ing wildlife-habitat relationships. 

The other sociological aspect has to do with the 
transfer of new knowledge, approaches, and technologies
into management practice and policy. Applied
ecology is often used to refer to ecological studies that
have some real-world relevance, but the term has real 
meaning only if the results of the science actually af- 
fect the implementation of management at some level. 
There is abundant evidence that many of the buzz- 
words of modern ecology, such as biodiversity, ecosys- 
tem management, indicator species, and landscape, 
have made their way into the lexicon of policy and 
legislation, but the science behind these terms has not 
permeated practice to an equivalent extent. This re- 
quires that the findings and thinking of science be 
communicated to managers and policy makers in un- 
derstandable terms and that the needs and priorities of 
management and policy be communicated to scientists 
in equally understandable terms. Wildlife 2000 was 
explicitly focused on fostering this communication 
and was at least moderately successful. Since then, 
however, the gulf between science and management 
has widened instead of closing. The very innovations 
in technology and concepts that have contributed so 
much to our progress in modeling wildlife-habitat re- 
lationships have made it increasingly difficult for man- 
agers, whose training has been in other areas, to keep 
up. Moreover, ecologists and researchers still seem 
more interested in talking to one another than in communicating
with practitioners and policymakers. I’ll
return later to this aspect of the sociology of applied 
science. 


Problems 


Given all of the progress we’ve made, one would think 
we would be closer to solutions, to achieving a real 
understanding of wildlife-habitat relationships, so we 
could generate robust and reliable predictions. We are. 
Through no fault of their own (except history), many 
of the contributions to Wildlife 2000 now seem overly 
simplistic or downright naive. Yet, the new tools and 
concepts that have encouraged all this progress have 
also exposed new problems, many of which we were 
only dimly aware of, or actively swept aside or sup- 
pressed, as we defended our beloved paradigms and 
procedures. Equilibrium thinking and linear regression 
come quickly to mind. Perhaps one additional sign of 
progress is that many of the contributions to this vol- 
ume consider these problems in one way or another. 
But the concerns are so basic to what we want to do
that, at the risk of generating a litany of criticisms, I'll 
highlight those that to me seem most critical, touching 
on all of the key words in the title of this book (and 
then some). 


The Use and Misuse of Tools 


Models of one sort or another have become our pri- 
mary means of assessing wildlife-habitat relationships 
and generating predictions of the consequences of 
habitat change. Ultimately, the strength or weakness 
of models is related to the tools used to construct or 
provide information to the models, the data used to 
drive the models, and the internal structure of the 
models themselves. Technological advances have given 
us tools that are immensely powerful, but behind this 
power lies the peril of indiscriminate use, of mindless 
application. We can calculate the fractal dimension of 
all sorts of things, for example, and there is a rich 
body of theory and application of fractal geometry in 
other sciences. But what do fractals really mean, eco- 
logically? We have precious little theory to suggest 
what we might expect from a movement pathway or a 
landscape of a given fractal dimension in terms of eco- 
logically important consequences. It would be prema- 
ture, however, to disregard fractal measures simply be- 
cause we haven't yet developed fractal ecology theory. 
Milne (1997) provides a nice perspective on the uses 
of fractals in wildlife biology. 

GIS analysis has become a major bandwagon in 
ecology, and justly so. Yet, the ability to generate mul- 
tilevel, brilliantly colored maps and to link these with 
spatially explicit computer models does not ensure 
that we have a better understanding of ecologically 
important process-pattern linkages. Because our 
choice of variables and scales is often fixed by the 
technology (e.g., particular spectral bandwidths or 30- 
meter pixel sizes in remote sensing), the "truth" that 
emerges in a computer-generated map is as much a 
product of the constraints as of the data. One can 
lie—inadvertently, I’m sure—with maps just as easily 
as one can lie with statistics. Van Horne (Chapter 4) 
and Henebry and Merchant (Chapter 23) provide 
some additional cautionary words about the overly 
enthusiastic use of GIS in wildlife-habitat analysis and 
modeling. 


In some cases, we aren't making sufficient use of 
the tools that are available. All environments are het- 
erogeneous and all organisms are aggregated at one 
scale or another. We know that this spatial hetero- 
geneity can have profound ecological consequences 
(for a sampling see Kolasa and Pickett 1991; Tilman 
and Kareiva 1997; Maudsley and Marshall 1999; 
Hutchings et al. 2000), yet the expanding array of 
spatial statistics that can be used to analyze spatial 
patterns is still largely ignored in wildlife-habitat investigations.
For example, of the chapters in this volume
that include specific, empirical analyses, hardly
any use spatial statistics. As Austin (Chapter 5) has 
warned, however, it is risky to ignore spatial autocor- 
relation. Beyond this, there is a wealth of information 
buried in the spatial patterns of organisms and habi- 
tats, and spatial statistics provide a way to distill that 
information. 


Variable Selection 


It is trite to say that models (at least empirical ones) 
are only as good as the data that drive them. Our ca- 
pacity to gather data has never been greater. But what 
sorts of data should we gather? Ecologists, wildlife bi- 
ologists, and modelers have grappled for decades with 
the issue of which habitat variables to measure (Mor- 
rison et al. 1998 review this history). To some degree, 
the selection of variables has been based on a knowl- 
edge of the natural history of the target organisms, but 
it has also been strongly influenced by tradition (i.e., 
we measure what those before us have measured) and 
by the availability of tools (e.g., if we are interested in 
measures of landscape structure, we go to 
FRAGSTATS and follow the directions). As the collec- 
tion of candidate variables for habitat analysis contin- 
ues to grow, it becomes more difficult to choose 
among them and more tempting to employ a “shotgun”
approach, hoping that computer models or statistical
analyses will, in the end, tell us which ones are
important. Moreover, every variable we measure has 
an associated variance—that is, after all, why they're 
called variables. Although consideration of this vari- 
ance is part of any statistical analysis, we still tend to 
regard it as “sampling error” or “noise,” something to
be controlled for so we can get at the real meaning of 
the means. But, as Dan Simberloff once observed, 
what is noise to a physicist (or a statistician) is music
to an ecologist. (Dan failed to specify the kind of 
music; given contemporary tastes, this could span a 
wide array of discordant harmonies [cf. Botkin 
1990].) In other words, the variance contains impor- 
tant biological information in addition to an in- 
evitable stochastic component, and we would do well 
not to ignore it. What does it tell us about habitat 
quality, for example, if a species occurs at low density 
in one area and high density in another, even though 
densities vary among years much more in the latter 
area? Not much, perhaps, but the contrast is worth 
considering, and by focusing on mean values we may 
miss important insights. 


Model Evaluation 


We must also be concerned about model structure 
and evaluation. It is well and good to insist that 
models be “logically consistent,” but what, exactly, 
do we mean by this? That the model operations are 
founded on proper mathematics? That circularity is 
avoided? That the model functions portray biologi- 
cally feasible processes? Ideally, a “good” model in- 
cludes all of these. We usually evaluate models, how- 
ever, by judging their goodness of fit (i.e., their 
accuracy), either in a comparison with the original 
data from which the model was generated—a logical 
no-no for any purposes other than determining 
whether the model works (i.e., validation)—or with 
freshly minted data. But what is the standard for 
such evaluations? Perhaps the model could be com- 
pared with a null or neutral model, except that we 
know some such models are so unrealistic that a de- 
parture from them only indicates that the model 
under consideration may have some degree of bio- 
logical realism (Gotelli and Graves 1996; With and 
King 1997). Or, perhaps we could develop multiple, 
alternative models and pick the “best” model using 
an objective measure (e.g., AIC), but such comparisons
are only as good as the set of models considered.
Being the “best” among a series of unrealistic
or poor models does not make a model “good.”
Model evaluation is a central focus of many of the 
contributions to this volume, so the problem is 
clearly receiving a lot of attention. 
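
The AIC comparison mentioned above can be illustrated with a short sketch. The formula AIC = 2k - 2 ln(L) and the Akaike weights are standard; the candidate model names, log-likelihoods, and parameter counts below are hypothetical numbers chosen only to show the mechanics.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike's information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

def akaike_weights(aic_values):
    """Relative support for each model within the candidate set."""
    best = min(aic_values.values())
    rel = {name: math.exp(-0.5 * (value - best))
           for name, value in aic_values.items()}
    total = sum(rel.values())
    return {name: r / total for name, r in rel.items()}

# Hypothetical candidates: (maximized log-likelihood, parameters).
candidates = {
    "shrub cover": (-120.4, 2),
    "shrub cover + grass cover": (-115.1, 3),
    "all eight habitat variables": (-113.9, 9),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
weights = akaike_weights(scores)
best = min(scores, key=scores.get)

for name in candidates:
    print(f"{name:30s} AIC={scores[name]:7.1f} weight={weights[name]:.3f}")
print("lowest AIC:", best)
```

Note that the weights sum to one by construction, whatever the quality of the candidates. The ranking is therefore purely relative, which is exactly the limitation raised above: a model can carry most of the weight and still be a poor description of the system.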


Prediction and Accuracy: 
The Holy Grail of Wildlife Biology 


Certainly one of the best measures with which to eval- 
uate models, and the one in which we are ultimately 
most interested, is predictiveness. In the realm of 
wildlife-habitat management, the most elegant and 
logically consistent model remains an esoteric toy un- 
less it can actually predict something useful. Pre- 
dictability, however, is itself problematic. What, ex- 
actly, is it that we want to predict? The simple 
presence of a species? Its abundance? The annual vari- 
ance in abundance? Its sensitivity to habitat change? 
Its population dynamics? Clearly, different questions 
demand different approaches, and much of the variety 
of models represented in this volume undoubtedly re- 
flects differences among authors in what they wanted 
to predict. We also aim for accuracy in our predic- 
tions. Indeed, accuracy is the motherhood-and-apple- 
pie of modeling and quantitative ecology. Because of 
this, there may be a temptation to measure those 
things that we can measure with great accuracy, hop- 
ing that accuracy in measurement will translate into 
accuracy of model predictions. Again, we need to ask 
whether these are the right variables to measure. 
Instead of endlessly pursuing ever-greater accuracy, 
we should also ask how much accuracy we really 
need. Will a marginally significant model—P > 0.05, 
of course—suffice if our objective is to assess coarse 
associations between wildlife and habitats (even 
though such a model would have a low probability of 
acceptance (P < 0.05) in a mainstream scientific journal)?
Conversely, is a model that generates a significant
relationship between variables really useful if the
amount of variance it explains (the oft-overlooked R²
value) is low? We have all seen (and some of us have 
published) scatter plots containing a large cloud of 
points intersected by a "significant" regression line. 
As our capacity to gather data (and therefore gener- 
ate large sample sizes) increases, the likelihood of 
finding such significant but meaningless relationships 
also increases. And what do we do when a seemingly 
good wildlife-habitat model (as judged by a high R²,
for example) fails to predict what it should? This was
the problem that John Rotenberry posed in his contribution
to the Wildlife 2000 volume (“Habitat Relationships
of Shrubsteppe Birds: Even ‘Good’ Models
Cannot Predict the Future.” In fact, an entire section
of that volume was devoted to the problem of the 
failure of habitat models as predictors.). John and I 
have recently confirmed the accuracy of his title in a 
resoundingly discouraging way. Beginning in 1977, 
we surveyed bird populations and measured habitat 
features over six years at fourteen sites in the shrub- 
steppe of Oregon and Nevada. Despite annual varia- 
tion, we were able to develop significant regression 
models (P < 0.01) of bird abundance and habitat variables
that had a reasonably good fit. The multiple regression
model for sage sparrows (Amphispiza belli),
for example, had an R² value of 0.60. John and I returned
to the shrubsteppe in the summer of 1997 and
resurveyed the same transects using the same methods 
(but with older observers). We used the 1997 habitat 
data as input to the 1977-1982 model to generate 
predicted sage sparrow abundances in 1997. The 
match with the observed abundances was, to say the 
least, poor (Fig. 65.1). Being good ecologists, we can 
come up with several post facto explanations for why 
the model predictions failed. For example, events 
elsewhere—such as El Niño, which we tend to blame
for everything we can't otherwise explain—or events 
in the past may have affected current abundances and 
may have done so differently in different areas. De- 
partures from habitat occupancy based on an ideal- 
free distribution, or due to individual variation in 
habitat selection, or associated with stochastic 
factors—the other favorite explanation for variations 
we can't otherwise explain—could also erode the fit 
between observations and predictions. Perhaps more 
sophisticated modeling approaches might yield better 
predictions. The basic problem, however, is that we 
are dealing with a dynamic, nonequilibrial system in 
which different factors, acting at different scales, with 
varying time lags, determine abundances at the scale 
of local survey plots in a given year (or decade). Stud- 
ies of Townsend's ground squirrels (Spermophilus 
townsendii [now Piute ground squirrel, Spermophilus
mollis]) by Van Horne et al. (1997) illustrate how
“good” habitat (as judged by demographic measures)
in a wet year can become “poor” habitat during a 
drought, and vice versa. We are trying to model a 
moving target, and an erratically moving one at that.

Figure 65.1. The problem in a nutshell: abundances of sage
sparrows (Amphispiza belli) on survey plots in western shrub- 
steppe predicted on the basis of a model incorporating several 
habitat measures are displayed versus abundances actually 
observed on those plots. The solid circles are the data points 
used to develop the model (1977-1982), and the close fit between
predicted and observed values (P < 0.01, R² = 0.60)
suggests that the model might have reasonable predictive
power. The open circles are from surveys conducted in 1997, 
in which the 1997 habitat measures were used in the 
1977-1982 model to generate predictions of 1997 sparrow 
abundances. Clearly, it's not working. 
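
The gap between goodness of fit and predictive power can be illustrated with a minimal sketch in Python. It is not the shrubsteppe analysis: the data are synthetic, and every name, sample size, and coefficient below (cover_early, abund_late, the slope of 0.25, and so on) is an assumption chosen for illustration. A simple regression is fitted to a calibration period in which abundance tracks one habitat variable, and the fitted model is then applied to a later survey in which abundance is driven by something the model never saw.

```python
import numpy as np


def r_squared(observed, predicted):
    """Coefficient of determination of predictions against observations."""
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)

# Hypothetical calibration period (e.g., 14 sites x 6 years):
# abundance tracks shrub cover, plus noise.
cover_early = rng.uniform(0, 40, size=84)
abund_early = 2.0 + 0.25 * cover_early + rng.normal(0, 2.5, size=84)

# Fit abundance ~ cover by ordinary least squares.
slope, intercept = np.polyfit(cover_early, abund_early, deg=1)
r2_fit = r_squared(abund_early, intercept + slope * cover_early)

# Hypothetical later survey: same range of habitat values, but abundance
# no longer depends on cover at all (an assumed shift in the system).
cover_late = rng.uniform(0, 40, size=14)
abund_late = 6.0 + rng.normal(0, 3.0, size=14)
r2_late = r_squared(abund_late, intercept + slope * cover_late)

print(f"R^2 on calibration data: {r2_fit:.2f}")
print(f"R^2 on later survey:     {r2_late:.2f}")
```

The second value can be negative, which means the old model predicts the later abundances worse than their own mean would. Nothing in the calibration-period fit statistics warns of this, because the failure lies in the changed system rather than in the regression.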


Species Occurrences 


The intent of wildlife-habitat modeling and analysis, 
in general, is to understand and predict the occurrence 
of species in space and time. The simplest approach 
(at least in terms of data collection) seemingly is to 
record species presence and absence. Quite apart from 
the problem of model efficacy in predicting presence 
and absence based on habitat measures (errors of 
omission and commission, discussed often in this vol- 
ume) there is the problem of interpreting “presence” 
or “absence” in ecologically meaningful terms. If a
species is present in an area, does that mean that the 
habitat is suitable? Suitable for what? If areas are surveyed
only once (as is often the case), records of presence
may include transients that don't really “belong
there.” (This is my chance to refer you to Wiens
(1981c), a [perhaps justly] long-neglected paper.) Are
species that are present in few sites actually rare, or 
simply undetectable? Or, if a species is absent from a 
location or habitat, does that mean that it isn't really 
"habitat" for that species, or that it was really there 
but wasn't detected, or that it is normally there but 
was absent when the survey was conducted? I think 
we often assume that these uncertainties will be mini- 
mized if we accumulate a lot of presence-absence ob- 
servations. Even if that is done, however, there is an 
implicit assumption that presence is matched with 
“suitable” habitat and absence with “unsuitable” 
habitat. In other words, habitat occupancy is (with a 
certain amount of “noise”) equilibrial. But we know
from numerous studies that events elsewhere (e.g., on 
the wintering grounds of migratory birds) can affect 
patterns of habitat occupancy, or that there may be 
time lags in the response of a species to a change in 
habitat suitability (e.g., species loss following forest 
fragmentation). We also know (or at least think) that 
habitat occupancy is related to density. The Fretwell- 
Lucas ideal-free distribution (IFD) model, for exam- 
ple, predicts that the range of habitat occupancy will 
increase as regional population density increases, and 
this formulation has influenced much of our thinking 
about habitat occupancy. 
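
The IFD prediction that occupancy spreads to poorer habitats as density rises can be sketched in a few lines of Python. This is a simplified illustration, not the Fretwell-Lucas formulation itself: the linear crowding cost, the integer suitability units, and the names settle and occupied_habitats are all assumptions made here. Individuals settle one at a time in whichever habitat currently offers the highest suitability, and each settler lowers the suitability of the habitat it joins.

```python
def settle(basic_suitability, n_individuals, crowding_cost=10):
    """Place individuals one at a time in the currently best habitat.

    basic_suitability: suitability of each habitat when empty (integer units).
    crowding_cost: assumed linear drop in suitability per resident.
    Ties go to the habitat listed first.
    """
    counts = [0] * len(basic_suitability)
    for _ in range(n_individuals):
        current = [b - crowding_cost * c
                   for b, c in zip(basic_suitability, counts)]
        best = current.index(max(current))
        counts[best] += 1
    return counts


def occupied_habitats(counts):
    """Number of habitats holding at least one individual."""
    return sum(1 for c in counts if c > 0)


habitats = [100, 70, 40]  # hypothetical good, moderate, and poor habitats
for n in (3, 10, 12):
    counts = settle(habitats, n)
    print(n, counts, occupied_habitats(counts))
```

With these assumed values, three individuals all settle in the best habitat, ten spread over two habitats, and twelve occupy all three. Presence in the poorest habitat therefore says as much about regional density as about the habitat itself.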

Ah, density! Surely there is no more persistent no- 
tion in the annals of wildlife-habitat analysis than the 
expectation that local population density should be 
positively related to habitat quality. After all, we are 
part of a culture that generally adheres to the premise 
that more is better, so why shouldn't this apply to 
habitat occupancy? The problem is that (at least with 
respect to wildlife and habitat) the premise is both log- 
ically and empirically flawed—as Van Horne (1983) 
pointed out nearly two decades ago, Maurer (1986) 
reinforced, and Garshelis (2000) has forcefully reiter- 
ated. More resources (i.e., better-quality habitat) do 
not necessarily translate into more individuals. It is 
hard to escape from the intuitive notion that abun- 
dance must indicate something important—we just 
aren't sure what it is. 

The issue of habitat occupancy and what it means 
leads to another, more basic concern. Our approach to 
"habitat" is strongly categorical. Although we may 
take all sorts of quantitative measures of habitat variables,
we tend to do so only to characterize a habitat
type that has been determined a priori (often on the 
basis of dominant vegetation). Habitat types are re- 
garded as discrete entities, and while we may array 
them in nested hierarchies or use Boolean or fuzzy set 
approaches to define them (see Hill and Binford, 
Chapter 7), they end up being categories nonetheless. 
Tools such as GIS simply reinforce this perspective, for 
what is a habitat map without discrete boundaries on 
the habitat types? Managers like habitat maps because 
their decisions are usually made with reference to 
land-use blocks, which are administratively discrete. 
The assignment of “habitats” to categories also dovetails
nicely with categorical forms of statistical analysis,
such as ANOVA (or, less conspicuously, CART).
Yet, we know that there is substantial variation within 
any defined habitat type, and their boundaries in the 
real world are often gradual and indistinct, especially 
in relatively “natural” environments. Rather than
thinking discretely about habitat, it might be worth- 
while to think of gradients in habitat conditions, to let 
topographic maps rather than land-cover maps be our 
conceptual guides. Mike Austin (Chapter 5) has pro- 
vided some guidance about how we might do this. 


Scale 


Scale has rapidly become a central issue in ecology. It 
was scarcely mentioned in Wildlife 2000, yet it figures 
prominently in many of the contributions to this vol- 
ume. But an awareness of scale does not necessarily 
translate into effective ways of dealing with it. There 
are three aspects to the “scaling problem.” First, it has
become apparent that virtually everything in ecology 
depends on the scale at which it is viewed. Patterns 
that are obvious at fine spatial scales or over short 
time periods dissolve at broader scales of space and 
time, perhaps to be replaced by startlingly different 
patterns. The associations of sage sparrows with big 
sagebrush (Artemisia tridentata) cover, for example, 
are either positive or negative depending on the spatial 
scale at which they are assessed. Statements about 
habitat relationships or projections of wildlife-habitat 
models are useless unless they are accompanied by 
qualifiers linking them with a particular scale of meas- 
urement or application. 
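
A small constructed example in Python shows how the sign of an association can reverse with scale. The numbers are invented purely for illustration and are not sage sparrow or sagebrush data; the layout of three regions with three plots each is likewise an assumption. Within each region the two variables are arranged to move in opposite directions, while the regional means move together.

```python
import numpy as np

# Three hypothetical regions, each containing three survey plots.
region_means = np.array([10.0, 20.0, 30.0])
offsets = np.array([-1.0, 0.0, 1.0])

# Rows are regions, columns are plots within a region.
cover = region_means[:, None] + offsets[None, :]
abundance = region_means[:, None] - offsets[None, :]


def corr(x, y):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(x, y)[0, 1]


# Fine scale: correlation among plots within each region.
within = [corr(cover[i], abundance[i]) for i in range(len(region_means))]

# Broad scale: correlation among regional means.
among = corr(cover.mean(axis=1), abundance.mean(axis=1))

print("within-region correlations:", within)
print("among-region correlation:  ", among)
```

By construction the within-region correlations are negative and the among-region correlation is positive, so a statement such as "abundance increases with cover" is true or false depending on which scale is meant.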

The second aspect of scaling that is relevant here is
how scale should be specified. It has become customary
to bracket the range of scale encompassed by a
study (or a model) by grain (the finest scale of resolu- 
tion of the data, generally equivalent to survey-plot or 
pixel size) and extent (the area or region from which 
observations are made or to which a model is ap- 
plied). Grain and extent define the window through 
which we view patterns—by definition, we cannot de- 
tect patterns at scales finer than the grain or broader 
than the extent. When we assess variance or autocor- 
relation in habitat features or species occurrence or 
abundance, the variance measures, like the means (or 
whatever other metrics we derive), apply within but 
not beyond the grain-extent window. The criteria used 
to determine grain and extent vary tremendously 
among studies, depending on their objectives and con- 
straints. This makes comparisons among studies diffi- 
cult, for investigations conducted with different win- 
dows on the world are likely to see different things 
simply because their windows differ in size. More 
problematic, however, are differences in grain and ex- 
tent that occur within a study. It is not unusual, for 
example, to see species surveyed using 50-meter radius 
circular plots, vegetation sampled using 10-square- 
meter quadrats, and GIS layers developed using 30- 
square-meter pixels, and all of them then combined in 
an analysis as if the scales used were the most appro- 
priate, biologically, and as if the differences in scales 
didn’t matter. Perhaps they don’t. Given the pervasive- 
ness of scale dependence in ecological systems, how- 
ever, it would be good to be sure. 
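
A quick calculation makes the mismatch concrete. The sketch below, in Python, takes the three sampling units named above literally as stated (a 50-meter-radius circular plot, a 10-square-meter quadrat, and a 30-square-meter pixel) and compares their areas; the dictionary name grain_m2 and its labels are illustrative assumptions.

```python
import math

# Grain of each data source in square meters, as given in the example above.
grain_m2 = {
    "bird survey plot (50 m radius)": math.pi * 50.0 ** 2,
    "vegetation quadrat": 10.0,
    "GIS pixel": 30.0,
}

finest = min(grain_m2.values())
for name, area in grain_m2.items():
    ratio = area / finest
    print(f"{name}: {area:,.0f} m^2, {ratio:,.0f}x the finest grain")
```

The survey plot covers several hundred times the area of a single quadrat, so a plot-level bird count and a quadrat-level vegetation measure describe very different windows on the same site.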

The third problematic aspect of scale has to do 
with extrapolation—what is in the technical literature 
known more cryptically as the transmutation prob- 
lem, which Bob O'Neill (1979) presciently called at- 
tention to decades ago, in another (unjustifiably) neg- 
lected paper. How does one use the results of studies 
conducted at one (grain-extent) scale to draw infer- 
ences or make predictions at other scales? Simply 
stated, one can't. Yet, we are generally constrained to 
conduct ecological studies at scales that are often con- 
siderably finer than their intended or desired scales of 
application. This is surely one of the most vexing, yet 
most critical, problems in applied ecology. Initial approaches
have been made in both empirical and modeling
studies. Empirically, the favored method seems
to be to conduct a multiscale investigation in which
habitat features and species responses are assayed at 
several scales of resolution (i.e., several discrete win- 
dow sizes). Roughly half of the empirical studies re- 
ported in this book used this approach. Although this 
represents an improvement on single-scale studies, 
most such studies consider only a few (generally, two) 
scales, and the scales are selected based on arbitrary 
(and rarely justified) criteria. A few modeling studies 
(e.g., Cogan, Chapter 18; Gross and DeAngelis, Chap- 
ter 40) are exploring the possibilities of multimodel- 
ing, in which models operating at different scales are 
operationally linked. 


Prospects 


Where does this leave us? Certainly with no dearth of 
problems to address. I'd like to conclude, however, by 
drawing attention to several considerations that I 
think lie at the foundation of our efforts to (as the 
book title suggests) predict species occurrences. 

First, scale (again). The problem is not only that 
ecologists deal with a wide array of scales that are 
often incompatible with one another. When the time 
comes to apply the results of these studies to manage- 
ment or policy problems, we are confronted with the 
reality that management and policy are frequently ex- 
ercised at quite different scales from the scales of ecol- 
ogy. Management scales are determined by adminis- 
trative boundaries, land ownership, and resource- 
extraction practices, not ecology, and policy is devel- 
oped at even broader scales. To think that Nature, 
which follows its own scaling rules, can somehow be 
made to fit within the arbitrary scales of management 
and policy is fantasy. 

This relates to my second point: organisms follow 
their own algorithms in responding to “habitat,” and 
this determines the scales at which they operate and 
over which variation in environmental conditions may 
be relevant to them. John Addicott and his students 
recognized the importance of organism-determined 
scaling in their concept of ecological neighborhoods, 
in a paper (Addicott et al. 1987) that has (thankfully) 
not been neglected. To understand scaling relation- 
ships, and indeed to understand anything about 
wildlife-habitat relationships, we need to adopt a view
of “habitat” that is centered on how organisms might
perceive and respond to it, not how humans think of 
it. Obviously, this is not easy to do, and it becomes 
more difficult as one considers organisms that are not 
mammals or birds, but insects or plants. The key, as 
several contributors to this volume have emphasized, 
is to focus on process, on the aspects of physiology, 
behavior, life history, and demography that actually 
produce the patterns we so meticulously document. To 
understand why a model does or does not predict 
well, for example, requires that we ask “why?” This 
comes down to an understanding of process, of mech- 
anisms, rather than a further refinement of patterns. 
Going back to the now-ancient notion of ecological 
niches (e.g., Austin, Chapter 5), developing “envirograms”
(Andrewartha and Birch 1984), or employing
knowledge-based approaches (e.g., Van Horne,
Chapter 4) may help us to incorporate process more 
directly into our thinking. 

Thinking about niches may also free us from the 
constraints of viewing habitats as categories. My third 
point, then, is to emphasize the value of viewing ecolog- 
ical systems as arrayed over gradients and considering 
the response functions of species to these gradients, be 
they unimodal, bimodal, or amodal. Environmental 
gradients may steepen in places, which we may then de- 
fine as boundaries (or thresholds, or nonlinearities). It is 
the form of the gradient, and of the species response, 
that will ultimately tell us what we need to know to 
model wildlife-habitat relationships. 

Fourth, it is imperative that we be clear about our 
goals. Why do we conduct a particular study or de- 
velop a certain model? In my own experience, the ob- 
jectives of a research project may be clear at the begin- 
ning and at the end, but along the way they have 
probably undergone a transformation—I prefer to 
think of it as “maturation.” It's important to be clear 
about goals and objectives both within a study and, 
particularly, among studies. A model developed for 
one purpose is not immediately transferable to some 
other purpose, even if it is a “good” model. Investiga- 
tions may be conducted for a variety of purposes and 
at a variety of levels. Many of the contributions to this 
volume, for example, focus on individual species, 
often in only a portion of their geographic ranges. As 
a consequence of the conservation focus on biodiversity
and policy dictates to implement ecosystem management,
however, attention is increasingly shifting to
multispecies investigations. This shift carries with it a 
host of new challenges, not the least of which is recon- 
ciling the multifarious scaling responses of different 
species with one another. (As an aside, I should note 
that there is some disagreement among reputable ecologists
about the value of investigations of ecological
communities. John Lawton [1999], for example, has
concluded that “community ecology is a mess, with so
much contingency that useful generalisations are hard
to find,” that “community ecology may have the
worst of all worlds,” and that “the time has come to
move on.” On the other hand, Ed Wilson [2000] observes
that “community ecology . . . is about to
emerge as one of the most significant intellectual frontiers
of the twenty-first century,” and that “it stands
intellectually in the front rank with astrophysics, genomics,
and neuroscience.” You decide.)

Finally, if any of this is to make any difference 
whatsoever, it is essential that the widening gulf be- 
tween scientists and managers be bridged, and bridged 
quickly. Ecologists must learn that the variability, 
complexity, and contingencies of Nature that so enthrall
them (Simberloff’s “music”) are not so appealing
to managers, who of necessity must seek simple
solutions. Statements that “it all depends” and calls 
for “more research” may satisfy ecologists, but they 
ring hollow to the manager. At the same time, man- 
agers must come to realize that the systems they wish 
to manage, and the species occurrences they wish to 
predict, are variable, and any conclusions or actions 
are bound to be accompanied by uncertainty (see 
Bradshaw and Borchers 2000). They also should real- 
ize that the arbitrary scales of management they have 
applied in the past may not be appropriate for attain- 
ing goals that relate to ecological systems that scale 
things differently. Progress is being made in addressing 
the issue of how much variability or complexity we re- 
ally need to consider in addressing particular prob- 
lems (e.g., Gross and DeAngelis, Chapter 40), and 
adaptive management is widely touted as an antidote 
for uncertainty. Such approaches need to be tested and 
expanded. Most importantly, however, both ecologists 
and managers need to talk with one another, free of 
their disciplinary defenses and jargon. Wildlife 2000
made a good start on this; we need desperately to resurrect
it.


Coda


It's interesting, and of more than passing importance, I 
think, to contemplate from whom we might draw in- 
spiration in dealing with these issues in wildlife-habi- 
tat relationships. G. Evelyn Hutchinson (1959), for 
example, drew his inspiration for contemplating the 
sources of what we now term biodiversity from Santa 
Rosalia. More recently, O'Neill and King (1998) 
adopted St. Michael as the patron for their explo- 
ration of scaling issues. Several other possibilities 
come to mind. Charles Elton developed early thinking 
about ecological niches, laid some of the empirical 
foundation for modern studies of population dynam- 
ics, and in his later years focused increasingly on the 
importance of “habitat.” David Lack did much to influence
our thinking about life histories and evolutionary
adaptations in ecology. Robert MacArthur catalyzed
the growth of theoretical ecology during the
last half of the twentieth century. Through his sensitiv- 
ity and understated elegance, Aldo Leopold gave both 
rigor and passion to wildlife ecology, and thence to 
conservation biology. 


I could go on. I, however, take my inspiration from 
the Australian ecologists Harry Andrewartha and 
Charles Birch. Before ecology became a modern sci- 
ence (i.e., 1954), they recognized that, when all is said 
and done, our central task is explaining the distribu- 
tion and abundance of organisms. The key to doing 
this, they held, lies in an understanding of how organ- 
isms respond to their environments—what we now 
call mechanistic or process-based ecology. These are 
the themes that echo through this book and that must 
underlie our efforts to understand and model wildlife- 
habitat relationships.

Breeding distribution, 221

Breeding range maps, 367-75, 
30.1-30.2, 30.1-30.3

Breitenmoser, Urs, 462

Brewer’s blackbird (Euphagus 
cyanocephalus), 503-4, 580 

Bridger-Teton National Forest, 
499-506, 44.1, 44.3 

Bristlecone pine (Pinus longaeva), 
316, 324, 25.4 

Broad landscape explanatory 
variables, 157-69, 12.3-12.6 

Broad-tailed hummingbird 
(Selasphorus platycercus), 579 

Brown bear (Ursus arctos), 60.1 

Brown creeper (Certhia americana), 
109, 236, 580, 18.1 

Brown-headed cowbird (Molothrus 
ater), 370, 580; Fort Hood, Texas, 
547-58, 49.1-49.5, 49.1-49.2;
United States, 219-28, cs17.1, 
17.2, 17.1-17.3 

Brown thrasher (Toxostoma rufum),
228 

Bull frog (Rana catesbeiana), 160,
12.1

Bullock's oriole (Icterus bullockii), 
580 

Bull trout (Salvelinus confluentus), 
327-34, 26.2-26.5, 26.1 

Bunnell, F. L., 58 

BUP (best unbiased predictor), 627 

Bureau of Budget National Map 
Accuracy Standards (NMAS), 429 

Bureau of Land Management (BLM), 
134, 242, 476 

Burgman, Mark, 266, 267, 268 

Burn-in time, 448-49 

Burrowing owls (Athene cunicularia), 
64 

Butterflies: Great Basin, 510-17; 
Greater Yellowstone ecosystem, 
500, 502, 504-5 

Butterfly surveys and inventories: 
Great Basin, 511; Greater 
Yellowstone ecosystem, 502 

Butternut (Juglans cinerea), 491-97, 
43.1, cs43.2a-cs43.2b, 43.3,
43.1-43.3 


Cablk, Mary, 266 

Calibration, 52, 129, 205-6; 
distribution model for lynx, 
654-55; distribution models for 
songbirds, 387-89, 32.1; spatially
explicit simulation models,
447-58, 39.2-39.5, 39.2-39.6 

California Bird Records Committee, 
368 

California black walnut (Juglans 
hindsii), 188

California Department of Fish and 
Game, 133, 134, 185, 233, 235 

California Gap Analysis Project, 
230-32, 237-39, 368, 375 

California giant salamander 
(Dicamptodon tenebrosus), 238 

California ground squirrels 
(Spermophilus beecheyi), 90, 6.5

California quail (Callipepla 
californica), 354, 579 

California slender salamander 
(Batrachoseps attenuatus), 18.1 

California spotted owl (Strix 
occidentalis occidentalis), 241, 
687-700, 61.1-61.3, 61.1-61.2 

California Wildlife Habitat 
Relationship system (CWHR), 184, 
185, 368 

Calling Lake Fragmentation 
Experiment, 560 

Calling Lake Study Area (Alberta), 
559-71, 50.1-50.2, 50.1

Calliope hummingbird (Stellula 
calliope), 579 

Callipepla gambelii, 650 

Canadian Wildlife Service, 663 

Canopy cover, 687-700, 61.2-61.5, 
61.2-61.5 

Canyon wren (Catherpes mexicanus), 
580 

Capelin (Mallotus villosus), 172 

Cape Sable seaside sparrow 
(Ammodramus maritimus 
mirabilis), 471, 472 

Capture data, 441-45, 38.1 

Cardinalis sinuatus, 650 

Carex curvula, 321, 324, 25.2-25.3, 
25.5, 25.2, 25.4 

Carolina chickadee (Poecile 
carolinensis), 608, 611-12, 613, 
54.2-54.3 

CART (classification and regression 
tree analysis), 32, 33, 60, 273; 
brown-headed cowbird study, 
220-22, 225-27, 17.2; fungal 
habitat model, 486-88, 41.5, 42.1; 
montane meadows study, 502;
Oregon vertebrate species richness
study, 434-40, 37.2-37.5, 37.3; 
Southern Appalachian bird species, 
608-12, 614; yellow-billed cuckoo, 
cs54.2 

Cascade Range, 478, 480, 510, 665 

Cassin's finch (Carpodacus cassinii), 
580 

Cassin's vireo (Vireo cassinii), 236, 
18.1 

Category definition in habitat models, 
97-106, 7.1-7.2, 7.1 

Catherpes mexicanus, 650 

CATMOD, 402 

Cattle density, 347, 352, 28.5 

Cattle herd behavior, 549, 553-55 

Causal relationships, 68-69 

Cavity-nesting birds, 117 

CCR (correct classification rate): in 
amphibian habitat models, 161, 
163; in breeding seabird study, 
174, 13.3; in golden eagle nest site 
study, 278; in Maine land bird 
study, 596-601, 53.5 

Cedar waxwing (Bombycilla 
cedrorum), 580 

Cellular automata models, 234, 235 

Census, 51; for Acadian flycatcher, 
400, 401, 402 

Central America, 665 

Central and Southern Florida Project 
Restudy, 471-72 

Central Highlands, Victoria, 
Australia, 80-81, 303-13, cs24.1 

Central Sand Plains District (Wis.), 
400 

Central Valley, Calif., 88, 184 

Cercocarpus ledifolius, 323, 324, 25.4 

Cercyonis oetus, 504 

Chaetodipus intermedius, 650 

Chamise-redshank chaparral, 18.2 

Chamois (Rupicapra rupicapra), 654 

Chanterelle species, 475-81 

Chaparrals, 93 

Chattahoochee National Forest (Ga.),
608, 54.1 

Cherokee National Forest (Tenn.), 
608, 54.1 

Chestnut-backed chickadee (Poecile 
rufescens), 236, 580, 18.1 

Chestnut-sided warbler (Dendroica 
pensylvanica), 228; Great Lakes 
Basin, 32.1-32.2; Maine, 599, 604,
53.1-53.3; Wisconsin, 404-5,
34.1, 34.1-34.2, 34.5-34.7 

China Lake Naval Weapons Center, 
134 

Chipping sparrow (Spizella passerina), 
228, 504, 580 

Chi-square, 55, 56, 3.1; forest 
songbird species models, 402, 34.2 

Chorus frog (Pseudacris triseriata), 
160-61, 163, 12.1-12.2 

Clark’s nutcracker (Nucifraga 
columbiana), 579 

Classical set theory, 98-99 

Classifiers, 272-73, 22.1

Classification accuracy, 273-74; in 
Eurasian badger study, 258-62; in 
green woodpecker habitat study, 
200, 15.2; in Swainson’s thrush 
habitat study, 113, 118, 8.3-8.5 

Classification and regression tree 
analysis. See CART (classification 
and regression tree analysis) 

Classification trees. See CART 
(classification and regression tree 
analysis) 

Clay-colored sparrow (Spizella 
pallida), 228 

Clearwater National Forest (Idaho), 
574

Cleogyne ramosissima, 324, 25.4 

Clethrionomys gapperi, 650 

Clique size, 338 

Coarse resolution environmental 
variables, 8, 10, 37; Mojave 
Desert, 133-39, 10.1-10.2; Orbe 
Valley and Geneva Canton, 
Switzerland, 202 

Coastal oak woodlands, 235, 236, 
238, 18.2 

Coastal scrub, 235, 236, 238, 18.2 

Coast Range of Oregon, 478, 480 

Coenonympha haydenii, 504, 505 

Coenonympha ochracea, 504

Cohen’s Kappa. See Kappa statistic 

Coleogyne ramosissima, 323 

Collaborative approaches to adaptive 
management, 241-53, 19.4 

Colonized ranges of brown-headed 
cowbirds, 219, 224-25, 227 

Colony size models, 174, 177-78, 
13.4-13.5 

Colorado River, 510 

Coluber constrictor, 650 


Columbia National Wildlife Refuge, 
665 
Columbia River, 510 
Columbia River Basin, 468 
Colusa, Calif., 196
Commission for Environmental 
Cooperation, 619 
Common eider (Somateria 
mollissima), 172, 174, 178-81, 
13.1-13.3 
Common merganser (Mergus 
merganser), 579 
Common poorwill (Phalaenoptilus 
nuttallii), 579 
Common raven (Corvus corax), 109, 
579
Common snipe (Gallinago gallinago), 
504, 579 
Common yellowthroat (Geothlypis 
trichas), 228; Greater Yellowstone 
ecosystem, 504; Idaho, 580; 
Maine, 598, 600, 53.1-53.3 
Community, 48, 49, 51 
Community ecology, 11 
Community guild, 47 
COMPAS, 78 
Complexity, 51 
Computational ecology, 467-74 
Computer resources: history of, 53-61 
Concordance, 533, 47.1 
Confusion matrices, 273, 275, 429, 
21.2; Australian rare plant species, 
308, 24.2; Belalp/Spring 
Mountains alpine species, 320, 
321, 25.3; Eurasian badger study, 
260, 20.3 
Conservation Biology, 411 
Conservation planning: Great Basin, 
507-17; Gulf Coast, 588-89; 
Pacific Northwest, 710-11. See 
also Management implications 
Conservation Reserve Program (CRP), 
221, 226, 442, 17.1 
Constant-effort mist-netting, 727-28, 
733-34, 64.1-64.3
Constrained aggregations, 88-89, 673 
Constraint models, 32-33 
Continental Divide, 112, 117 
Continuum concept for plants, 73-74 
Controlling bias, 537-46, 48.1-48.4, 
48.1-48.2 
Cooper's hawk (Accipiter cooperii), 
29 


Cope's gray treefrog (Hyla 
chrysoscelis), 160, 12.1-12.2 

Cordilleran flycatcher (Empidonax 
occidentalis), 579, 730, 64.1-64.3, 
64.1-64.3 

Cormack-Jolly-Seber mark-recapture 
model, 727 

Correct classification rate. See CCR 
(correct classification rate) 

Correlation bias, 539, 544, 545, 48.3 

Costa Rica, 665 

Cottonwood (Populus fremontii), 184, 
187-88

Cougar (Felis concolor), 60.1 

Count survey data, 206-7, 16.1; 
landbird monitoring programs, 
109-11; for northern bobwhite 
ANN model, 347, 349-52; 
Pennsylvania bird species, 625-38. 
See also Survey designs 

Cox statistic, 388-89 

Coyotes (Canis latrans), 152, 11.2 

Cranfield University, U.K., 256 

Crested ibis (Nipponia nippon), 60.1 

Critique for Ecology, A, (Peters), 126 

Cross-validation: brown-headed 
cowbird study, 220-21; Calling 
Lake bird species study, 568, 569; 
Eurasian badger study, 257-58, 
20.3; golden eagle nest site study, 
278; salmonid patch-based study, 
331, 26.1; Swainson's thrush study, 
1145 113, 118 

Crotalus lepidus, 650 

CRP (Conservation Reserve Program), 
221, 226, 442, 17.1

Cuba, 665 

CWHR (California Wildlife Habitat 
Relationship) system, 184, 185, 
368 


D2 techniques for mapping, 281-89 

Daniel Boone National Forest (Ky.), 
491 

Dark-eyed junco (Junco hyemalis):
Greater Yellowstone ecosystem, 
503; Idaho, 580; New Mexico, 
642, 644, 650; Pacific Northwest, 
730, 64.1-64.3, 64.1-64.3

Darwin, Charles, 739 

Data, 74-75; geospatial, 296-98

DCA (detrended correspondence 
analysis), 135

DeAngelis, Donald L., 463

Death Valley National Park, 134 

Debinski, Diane M., 462 

Decision boundary, 273, 21.1 

Decision-making process, 59, 130, 
131-32; alternative models in, 
207-18, 16.2, 16.1-16.2; model 
validation in, 295 

Deerfield Ranger District, 611, cs54.2 

Deer mouse (Peromyscus 
maniculatus), 442-45, cs38.1, 
38.1-38.2 

Definitions, 43-50, 124 

DEM (digital elevation model): Belalp 
area (Switzerland), 318; Central 
Highlands, Victoria, Australia, 
305, 309; Eastern Broadleaf Forest 
Eco-Province, 160; Great Basin, 
508-9; Great Smoky Mountains 
National Park, 531, 534; Mid-
Lachlan Valley, 75; Mojave Desert,
133, 135, 138-39; Oregon, 432; 
Spring Mountains (Nevada), 319; 
Wyoming, 485 

Demes, 48, 664-65 

Demographic aggregations, 88 

Demographic monitoring, 727-36 

Demographic organization, 89-93 

Demographic stochasticity, 673, 678, 
684-85 

Density, 51. See also Abundance 

Department of Agriculture (U.S.), 
158, 243, 300, 359, 476, 477 

Department of Defense (U.S.), 134, 
493 

Department of Natural Resources and 
Environment (Australia), 304 

Department of the Interior (U.S.), 243 

Desert pocket mouse (Chaetodipus 
penicillatus), 642, 647, 648, 650 

Desert tortoise (Gopherus agassizii), 
60.1 

Detectability, 272, 357-65, 29.1 

Detrended correspondence analysis 
(DCA), 135 

Dettmers, Randy, 463 

DFA (discriminant function analysis). 
See Discriminant function analysis 

Dickcissel (Spiza americana), 228 

Differential error patterns, 367-75 

Digital Chart of the World, 220 

Digital elevation model. See DEM 
(digital elevation model)

Digital line graphs (DLG), 508-9

Digital terrain model (DTM), 256 

Dipnet surveys, 159-60 

Direct gradients, 74 

Discontinuity of stream-fish 
distributions, 519-27 

Discrepancy measures, 449-50,
454-56, 39.4

Discretely defined categories, 97-106, 
V TTE 

Discriminant function analysis, 54-55, 
273, 3.1; in Eurasian badger study, 
256-60, 20.1, 20.1-20.3; in golden
eagle nest site study, 278; for 
Southern Appalachian bird species, 
608-12, 614; for yellow-billed
cuckoo, cs54.2 

Discrimination histograms, 384-85, 
S28] 

Disease-resistant stock: butternut 
trees, 491 

Dispersion patterns, 36 

Distichlis spicata, 135 

Distribution, 51 

Distribution modeling, 25-33, 37, 
125-32, 617-23 

Distribution models: Australian rare 
plant species, 303-13; Australian 
vegetation and fauna, 73-82; 
Belalp/Spring Mountains alpine 
species, 315-26; brown-headed 
cowbirds, 221-24, cs17.1,
17.1-17.2; Calling Lake bird
species, 559-71, 50.1; Eurasian
badger, 255-62; forest songbird 
species, 383-90, 399-410; fungi, 
485; lynx, 653-59; magnolia 
warbler, 377-81; montane meadow 
communities, 499-506; New 
Mexico vertebrates, 639-51, 
57.1-57.5, 57.1; northern spotted
owl, 244-48; sage sparrow,
285-88; salmonids, 327-34;
stream fishes, 519-27 

DLG (digital line graphs), 508-9 

Domestic cat (Felis catus), 81 

Double-crested cormorant 
(Phalacrocorax auritus), 172, 174, 
175-81, 370, 13.1-13.5 

Douglas-fir (Pseudotsuga menziesii), 
109, 235, 238, 682 

Downy woodpecker (Picoides 
pubescens), 579 


Drawdown scenario, 722, 
cs63.8-cs63.9, 63.2

Dreisbach, Tina A., 462 

Driftless District (Wis.), 400 

DTM (digital terrain model), 256

Dunham, Jason B., 266 

Dusky flycatcher (Empidonax 
oberholseri), 197, 579, 682-84,
60.2 

Dwarf shrew (Sorex nanus), 647, 648, 
650 

Dynamic equilibrium model, 1.4 

dy, statistic, 320, 321, 326, 25.3 


EAM (Effective Area Model), 713-25, 
63.3, cs63.9 

Early-stage aggregations, 88, 6.7 

EarthInfo, 347 

Earth Observing System (EOS), 301 

Earth resources satellites, 301 

Eastern bluebird (Sialia sialis), 228 

Eastern Broadleaf Forest Eco- 
Province, 158, 12.1-12.2

Eastern gray treefrog (Hyla 
versicolor), 160-61, 163-65, 167, 
12.1-12.2, 12.5, 12.7 

Eastern hemlock (Tsuga canadensis), 
530 

Eastern Klamath, cs19.1, 19.2 

Eastern phoebe (Sayornis phoebe), 
228 

Eastern towhee (Pipilo 
erythrophthalmus), 228 

Eastern wild turkey (Meleagris 
gallopavo), 11.2 

Eastern wood-pewee (Contopus 
virens), 228; Maine, 595, 
53.1-53.3; Wisconsin, 404-5, 
34.1, 34.1-34.2, 34.5-34.7 

Ecobeaker, 548 

Ecological Applications, 411 

Ecological attributes, 368, 369, 371, 
30.1, 30.3 

Ecological boundaries, 230 

Ecological fallacy, 293 

Ecological modeling, 430 

Ecological Monographs, 411

Ecological patterns and processes, 
11-42, 294

Ecology, 411 

Ecoregional analysis: Santa Cruz 
County (Calif.), 230-32, 235, 236, 
18805 CSEE 


Ecosystem, 49, 51 

Ecosystem engineers, 70 

Ecosystem viability analysis (EVA), 
239 

Ectomycorrhizal fungi (EMF), 478-79 

EDA (exploratory data analysis), 
431-36 

Edge, distance to, 404, 34.3-34.4 

Edge characteristics, 141, 152-53 

Edge density, 198, 200, 202 

Edge effects, 722-23 

EER (expected error rate), 331 

Effective Area Model (EAM), 713-25, 
63.3 

Eigenvectors and eigenvalues, 287-88, 
22.3

Elevation data sets, 198, 200, 256; 
Belalp/Spring Mountains alpine 
species, 318, 321, 25.1, 25.3-25.4; 
Boise River and Lahontan Basins, 
328-30

Elith, Jane, 266, 267, 268 

Elton, Charles, 749 

Eltonian niche, 38, 46 

EMAP (Environmental Monitoring 
and Assessment Program), 219, 
442 

EMAP hexagons, 219-20 

Empirical models, 64, 4.1; for fungal
habitats, 483-89; spatial 
autocorrelation in, 429-40 

Endangered species, 40, 69, 230, 60.1 

Endangered Species Act (ESA), 241, 
243, 249, 461, 514, 673, 701 

End-user assessments, 294 

Energy state index (ESI), 584, 52.2 

Entity-based representations, 297 

Envelope method of bias control, 541 

Envirogram, 69 

Environmental gradient, 38, 73-82, 
613, 669, cs59.2 

Environmental impact statements, 56 

Environmental Monitoring and 
Assessment Program (EMAP), 219, 
442 

Environmental Protection Agency 
(U.S.), 219, 442, 540 

Environmental stochasticity, 673, 678, 
684-85 

Environmental stratification units 
(ESU), 75-76, cs5.2 

Environmental variables, 39; 
American woodcock, 337, 


27.1-27.2, 27.1-27.2; amphibians, 
160, 12.3-12.6; Australian rare 
plant species, 304—5, 24.3; 
Belalp/Spring Mountains alpine 
species, 318-19; brown-headed 
cowbirds, 220-24, 17.2, 
17.1-17.3; Calling Lake bird 
species, 561-65, 50.2, 50.2-50.4; 
eucalypt species, 76-79; fungi, 
484—85, 42.2; Great Basin 
butterflies, 507-17, 45.1—45.2; 
green woodpeckers, 198; lynx, 
653—54, 656-58, 58.1; magnolia 
warblers, 378-79, 381, 31.3; 
Maine breeding seabirds, 171-81, 
13.1; New Mexico vertebrates, 
639-51, 57.1—57.5; Oregon 
vertebrates, 432-34, 37.1-37.2; 
Southern Appalachian bird species, 
608, 54.1; Swainson’s thrush, 
113-15, 117-19, 8.1-8.2; wood 
thrush, 532-34, 540, 47.1, 48.1, 
48.4; yellow-bellied gliders, 80. See 
also Coarse resolution 
environmental variables; Fine 
resolution environmental variables 

ENVISAT, 301 

EOS (Earth Observing System), 301 

Ephedra viridis, 324, 25.4 

Ephemeral pools, 299-300 

EROS Data Center (USGS), 433, 442, 
509 

Error propagation, 429; in GIS, 
295-96 

Error rates, 578-79, 51.4—51.5 

Errors: differential patterns, 367-75; 
observational, 300; random, 429; 
spatial, 495-96; spread, 389; 
systemic, 429; thematic, 429-30; 
transmutation, 293. See also 
Prediction errors 

Errors of commission and omission, 
15, 275-77, 430, 618-19; in 
American woodcock habitat use 
study, 340; in atlas distribution 
maps, 391-92; in Belalp/Spring 
Mountains alpine species, 320; in 
breeding birds (Calif.) study, 369, 
370-71, 373-74; in breeding birds 
(Idaho) study, 573-80, 51.2-51.5; 
in conservation planning, 367; in 
fungal habitat model, 481; in 
geospatial data, 292; in golden 


eagle nest site study, 278, 280; in 
habitat association models, 
593-94; in habitat suitability 
models, 190, 193-94, 14.2-14.3, 
14.6-14.7; in Maine vertebrate 
study, 419-27; in models from 
atlas (gridded) data, 203; in 
northern spotted owl habitat study, 
246-47; in rare species studies, 
295; in salmonid patch-based 
study, 331-32 

Error sources, 419-20, 488 

ESA (Endangered Species Act), 241, 
243, 249, 461, 514, 673 

ESI (energy state index), 584, 52.2 

Estimation data sets, 129 

ESU (environmental stratification 
units), 75-76, cs5.2 

Ethical issues, 28, 104 

Eucalypt forests, 76-81, 5.5 

Eucalyptus delegatensis, 80-81 

Eucalyptus regnans, 80-81 

Eurasian badger (Meles meles), 152, 
255-62, 20.1, 20.1-20.3 

Eurasian oystercatchers (Haematopus 
ostralegus), 70 

European beech (Fagus sylvatica), 198 

European red fox (Vulpes vulpes), 81 

European sparrowhawk (Accipiter 
nisus), 152, 11.2 

European starling (Sturnus vulgaris), 
580 

European turtle dove (Streptopelia 
turtur), 11.2 

Eurytopes. See Generalists 

EVA (ecosystem viability analysis), 
239 

Evaluation data sets, 318, 320 

Evening grosbeak (Coccothraustes 
vespertinus), 580 

Everglades National Park (Fla.), 450, 
468, 39.1, 39.1 

Everglades-South Florida, 241, 298; 
computational ecology, 467-74; 
spatially explicit simulation 
models, 450-58, 39.1 

Evolutionary computation, 267 

Exotics, 67 

Expected error rate (EER), 331 

Exploratory data analysis (EDA), 
431-36 

Exploratory models, 28 
Extent, 8, 51, 60, 68. See also Scale of
observation 


Fallen Leaf Lake, Calif., 35 

False positives and negatives. See 
Errors of commission and omission 

Fauna modeling, 79-82, 5.5 

Federal/nonfederal habitats, 701-12, 
62.5 

Federal Office of Statistics 
(Switzerland), 655 

Federal Office of Topography 
(Switzerland), 655 

FEMAT (Forest Ecosystem 
Management Assessment Team), 
359, 701, 702

Fertig, Walter, 462 

FHASM (Fort Hood Avian Simulation 
Model), 549-51, 556-58, 49.2, 
49.1 

Field data, 206, 294; Baraboo Hills 
ecosystem (Wis.), 400; Great 
Smoky Mountains National Park, 
530-31; Mojave Desert, 134-35, 
136, 139 

Fielding, Alan H., 267, 268, 293 

Field sparrow (Spizella pusilla), 228 

Field studies: Great Basin, 508; Gulf 
Coast, 589-90; Maine, 419, 421; 
Santa Cruz County (Calif.), 237 

Filter variables, 639-51 

Fine resolution environmental 
variables, 37, 40, 44, 107, 206; 
Mojave Desert, 133-39, 10.1-10.2; 
Orbe Valley and Geneva Canton, 
Switzerland, 198, 202; Santa Cruz 
County (Calif.), 233 

Fir (Abies lasiocarpa), 109, 113 

FIS (Flora Information System), 304 

Fish and Wildlife Service (U.S.), 56, 
101-3, 104, 185, 243, 420, 477, 
480, 491, 663, 701-2 

Fish counts, 523, 46.1 

Fisher (Martes pennanti), 152, 358, 
11.2

Fish-habitat relationship studies: 
salmonids, 327—34; stream fishes, 
519-27 

Flammulated owl (Otus flammeolus), 
238, 18.1 

Fleischman, Erica, 462 

Flint Hills, Kans., 300 

Floodplain habitats, 183-96 
Flora Information System (FIS), 304

Florida Everglades, 241, 298, 39.1; 
computational ecology, 467-74; 
spatially explicit simulation 
models, 450-58 

Florida manatee (Trichechus manatus 
latirostris), 60.1 

Florida panther (Puma concolor 
coryi), 450, 468, 472 

Florida scrub jay (Aphelocoma 
coerulescens), 679, 60.1 

Foliage height diversity (FHD), 54, 56 

Forecasting models, 28 

Forest Ecosystem Management 
Assessment Team. See FEMAT 
(Forest Ecosystem Management 
Assessment Team) 

Forest inventory data, 559-71 

Forest Service Fungal Survey and 
Manage Team (U.S.), 477 

Forest Service (U.S.), 56, 107-8, 142, 
243, 359, 476, 480, 574, 614, 728 

Forest songbird species, 34.1-34.2; 
Baraboo Hills ecosystem (Wis.), 
399-410; Great Lakes Basin, 
383-90, 32.1, 32.1-32.3 

Forest-stand inventory data, 400, 
406-7 

FORPLAN (forest-planning tool), 205 

Fort Hood, Texas, 547-58, 49.1-49.5, 
49.1-49.2

Fort Hood Avian Simulation Model 
(FHASM), 549-51, 556-58, 49.2, 
49.1 

Fort Irwin Military Reservation, 134 

Foster Island (Calif.), 185 

Fox sparrow (Passerella iliaca), 580 

Fragmentation, 88-89, 152-53, 294; 
in biodiversity models, 66-67; in 
bird-habitat relationship studies, 
560; in multiple-scale models, 167; 
in parasite models, 222; in species 
distribution models, 402, 403 

FRAGSTATS, 160, 337, 562, 583 

Fraser fir (Abies fraseri), 530 

Fremont National Forest, 728 

French Institut Géographique 
National, 198 

Fruiting shrubs (Rosa spp.), 560 

Fruiting shrubs (Rubus spp.), 560 

Fungal habitat modeling, 475-81, 
41.2


Fuzzy logic, 60, 98, 99-103, 106, 7.2, 
ZS 


Gallatin National Forest, 499-506, 
44.1, 44.3 

Gambel’s quail (Callipepla gambelii), 
352, 354 

GAM (generalized additive models), 
30, 32, 33, 77-79; Australian rare 
plant species, 307, 311, 312, 24.2 

Gap analysis, 461 

GAP (Gap Analysis Program), 28-29, 
66, 420, 420—27, 442, 617, cs55.1. 
See also GAP programs under state 
names 

Garmin GPS navigator, 318 

GARP (genetic algorithm for rule-set 
prediction), 307-8, 620-22, 24.2, 
cs55.1; bias control in models, 
538, 541-42, 546 

Garrison, Barrett A., 268 

Gaussian models, 319—20, 321, 
325-26, 25.2-25.3, 25.2; 
Belalp/Spring Mountains alpine 
species, 316 

Gaussian response functions, 78, 5.4 

Gecko, 548 

Generalists, 593; Maine land birds, 
598, 605; neotropical migrant 
birds, 584, 52.2; Southern 
Appalachian bird species, 608 

Generalized additive models (GAM). 
See GAM (generalized additive 
models) 

Generalized linear models. See GLM 
(generalized linear models) 

Genetic algorithm for rule-set 
prediction. See GARP (genetic 
algorithm for rule-set prediction) 

Genetic algorithms, 267 

Geneva Canton, Switzerland, 
197-204, 15.1, cs15.2, 15.3,
15.1-15.2 

Geographers, 8, 44 

Geographic information system. See 
GIS (geographic information 
system)

Geological Service (U.S.), 368, 393, 
429 

Geological Survey (U.S.), 134, 378, 
420, 433, 461, 509, 582 

George Washington and Jefferson 


National Forest (Va.), 142, 608, 
611, 54.1, cs54.2 

Georgia, 607-15 

Geospatial data in time, 291-302 

G (gamma) statistic, 320, 321-22, 
326, 25.3

Giant garter snakes (Thamnophis 
gigas), 88-89 

Giant panda (Ailuropoda 
melanoleuca), 60.1 

GIS (geographic information system), 
70-71, 291-302, 461-62, 740; 
Baraboo Hills ecosystem (Wis.), 
400—402; Belalp/Spring Mountains 
alpine species, 316, 326; Boise 
River and Lahontan Basins, 330; 
California, 368, 369; Eastern 
Broadleaf Forest Eco-Province, 
160; Fort Hood, Texas, 548-49, 
552, 49.4—49.5; Greater 
Manchester, U.K., 256, 257; Great 
Smoky Mountains National Park, 
492-93, 531-32; Idaho, 285, 574; 
Klamath Province, 244, 248, 252; 
Maine, 420, 594; Maine (mid- 
coastal) study area, 174;
Mid-Lachlan Valley, 74-75, 77; Mojave
Desert, 133, 135, 138; New 
Mexico, 639-51; Orbe Valley and 
Geneva Canton, Switzerland, 198; 
Pennsylvania, 336; Sacramento 
River (Calif.), 185, 187; Santa 
Cruz County (Calif.), 233; United 
States, 22.5 

Glen-Colusa Irrigation District 
(GCID), 195 

GLM (generalized linear models), 30, 
77—79, 81, 5.4; Australian rare 
plant species, 305, 307, 311, 
312-13, 24.2; Belalp/Spring 
Mountains alpine species, 319-20, 
321, 325-26, 25.2-25.4, 25.2;
Calling Lake bird species, 562-65; 
lynx, 655, 58.2 

GLMM (generalized linear mixed 
models), 626 

Global positioning system (GPS): 
Great Smoky Mountains National 
Park, 493; Mojave Desert, 134; 
Southern Appalachian Assessment, 
609 

Golden-cheeked warbler (Dendroica 
chrysoparia), 399, 547, 549, 555 


Golden-crowned kinglet (Regulus 
satrapa): California, 235, 18.1; 
Idaho, 580; Maine, 594, 595, 598, 
599-600, 605, 53.1-53.3; Rocky 
Mountains, 109 

Golden eagle (Aquila chrysaetos), 
272, 278-80, 370, 21.2 

Gonzalez-Rebeles, Carlos, 462

Goodman and Kruskal's G (gamma) 
statistic, 320, 321-22, 326, 25.3 

Goodness-of-fit criterion, 126, 129, 
161, 174, 609; forest songbird 
species models, 402, 34.2 

GPS (global positioning system): 
Great Smoky Mountains National 
Park, 493; Mojave Desert, 134; 
Southern Appalachian Assessment, 
609 

Gradient-based approach to landscape 
structure, 667-72

Gradient-based sampling, 17, 65, 
67-68, 4.2 

Gradient of abundance, 93 

Gradsect sampling, 75 

Grain, 8, 51. See also Scale of 
observation 

Grand fir (Abies grandis), 109, 113 

Grand Teton National Park, 499—506, 
44.1 

GRASS GIS software, 549, 552, 
553-54, 49.4—49.5 

Gray catbird (Dumetella carolinensis), 
228, 580 

Gray fox (Urocyon cinereoargenteus), 
152, 154, 11.2 

Gray jay (Perisoreus canadensis), 579 

Gray wolf (Canis lupus), 60.1 

Grazing in aquatic systems, 19-20 

Grazing lands: Belalp area 
(Switzerland), 316; Fort Hood, 
Texas, 553-54; Great Plains, 299; 
Rolling Red Plains, Oklahoma, 
347 

Great Basin, 507-17 

Great black-backed gull (Larus 
marinus), 172, 174, 179-81, 
13.1-13.3 

Great Britain, 255-62 

Great crested flycatcher (Myiarchus 
crinitus), 404-5, 34.1, 34.1-34.2,
34.5-34.7 

Great Dividing Range (Australia), 
303-13 


Greater bilby (Macrotis lagotis), 60.1 

Greater glider (Petauroides volans), 
79-80, 60.1 

Greater Manchester, U.K., 255-62 

Greater Sage-Grouse (Centrocercus 
urophasianus), 55 

Greater Yellowstone ecosystem, 
499-506 

Great Lakes Basin, 378, 384, cs31.1 

Great Lakes Protection Fund, 384 

Great Lakes-St. Lawrence River Basin, 
241 

Great Plains, 219, 298-301 

Great Plains toad (Bufo cognatus), 
298-301 

Great Smoky Mountains National 
Park, 491-97, 43.1, 
cs43.2a-cs43.2b, cs47.2 

Green frog (Rana clamitans), 160, 
163, 168, 12.1-12.3 

Green-tailed towhee (Pipilo 
chlorurus), 642, 644, 650 

Green woodpecker (Picus viridis), 
197-204 

Grevillea barklyana, 311, 24.1 

GRID calculator, 319 

Grids: Australian map grid (AMG), 
309; EMAP hexagons, 219-20; 
hexagonal, 247-48, 19.3; in 
models for atlas data, 203, 204; 
quadrangle, 369 

Grinnellian niche, 38, 46, 593 

Grizzly bear (Ursus arctos horribilis), 
66, 241, 357, 60.1 

Gross, Louis J., 463 

Ground surveys, 480 

Groundtruthing, 198, 508 

Guilds, 47-48, 51 

Guisan, Antoine, 266-67, 268 

Gulf Coast, 581-83, cs52.1 


Habitat, 9, 45-46, 51, 63, 93-94, 
368, 739-40, 746 

Habitat association, 371-72 

Habitat association models, 431; 
Maine land birds, 593-606; Santa 
Cruz County (Calif.), 230, 232, 
237; vertebrates, 419-27 

Habitat availability, 51 

Habitat avoidance, 51 

Habitat commonality, 392-93, 394, 
39689981 
Habitat evaluation procedures (HEP),
56, 59, 69, 480 

Habitat management for population 
persistence under uncertainty, 
210-16 

Habitat maps, 367-75; northern 
spotted owl, 244 

Habitat modeling, 26, 35-41, 53-61, 
63-72; with geospatial data, 
291-302 

Habitat models: alternative models of, 
205-18; category definition in, 
97-106, 7.1-7.8, 7.1; filter 
variables in, 639-51, 57.1-57.5; 
numerical comparisons as basis for, 
83-95, 6.1—6.2, 6.1; validation test 
for, 607-15 

Habitat occupancy models, 211-12; 
breeding seabirds, 171-81, 
13.1-13.3 

Habitat preference, 39-40, 51 

Habitat preference assessments, 59, 
91; for forest songbird species, 
405-10, 34.2, 34.5-34.7 

Habitat quality, 51, 206, 669, 
678-79, cs59.2-cs59.3; yellow- 
billed cuckoo, 183-96 

Habitat resources. See Resources 

Habitat selection, 37-38, 51, 68; 
Eurasian badger, 261; forest 
songbird species, 399—410, 
34.1—34.3, 34.1-34.7; giant garter 
snakes, 88-89; in use of D2, 
282-83 

Habitat specialization, 599; 
neotropical migrant birds, 584, 
590, 52.2

Habitat suitability index. See HSI 
(habitat suitability index) 

Habitat suitability models, 57, 217, 
298; Great Basin butterflies, 
507-17; neotropical migrant birds, 
581-92; northern spotted owl, 
701-12, 62.1-62.5; sage sparrow, 
285-88, cs22.1—cs22.2, 22.3, 
cs22.4, 22.1-22.3; stream fishes, 
519-27, 46.2, 46.2; yellow-billed 
cuckoo, 183-96, 14.1-14.5, 
14.1-14.7 

Habitat use, 45, 51; American 
woodcock, 335-43 

Habitat variables. See Environmental 
variables 
Habitat viability analysis (HVA), 239

Haddock (Melanogrammus 
aeglefinus), 172

Hainan Eld's deer (Cervus eldi),
60.1 

Hairy woodpecker (Picoides villosus), 
579 

Halibut, 172 

Hall, Linnea S., 124, 368 

Hammond’s flycatcher (Empidonax 
hammondii), 579 

Hampton, Haydee M., 664 

Hartless, Christine S., 266 

Harvard Forest Long-Term Ecological 
Research study area, 103, 104, 7.4 

Haufler, Jonathan B., 664 

Hayden's ringlet butterfly, 504, 505 

HD (Hypothetico-deduction), 27, 
57-58 

Heglund, Patricia J., 26, 28-30, 32, 
613 

Helichrysum scorpioides, 310, 311, 
24.1 

Helmeted honey eater (Lichenostomus 
melanops), 60.1 

Hemlock woolly adelgid (Adelges 
tsugae), 530 

Henebry, Geoffrey M., 266 

HEP (habitat evaluation procedures), 
56, 59, 69, 480 

Hepinstall, Jeffrey A., 369, 463 

Hermit thrush (Catharus guttatus), 
228, 580, 595, 18.1, 53.1-53.3 

Hermit warbler (Dendroica 
occidentalis), 235, 18.1 

Herpetofaunal surveys: Great Plains, 
300-301 

Herptile LOORs, 420 

Heterogeneous landscapes, 713-25 

Hexagon models, 247-48, 19.3; 
EMAP, 219-20 

HGM (hydrogeomorphic method), 59 

Hierarchical spatial units, 664-65 

Hierarchical system organization, 
44-45, 685, 60.5; in stream-fish 
study, 520, 526 

Hill, Kristina E., 28 

Historical Climatology Network, 220, 
434 

Historical habitat, 89 

History of wildlife-habitat modeling, 
53-61 


H. J. Andrews Experimental Forest, 
478 

Hobbs, N. Thompson, 665 

Hollow-bearing trees, 80, 81 

Holthausen, Richard S., 664 

Holt Research Forest (Me.), 421, 
36.1, 36.3, 36.1-36.3, 36.4 

Home range, 51, 90, 679-80, 684-85, 
60.1, cs60.2, 60.3—60.4; spotted 
owl, 692-93, 61.2 

Home range suitability, 153, 154 

Hooded warbler (Wilsonia citrina), 
152, 399, 11.2 

Host abundance, 221-22, 224-26, 
17.2-17.3 

Host Abundance Index, 221-22, 224 

House sparrow (Passer domesticus), 
580 

House wren (Troglodytes aedon), 580 

HSI (habitat suitability index), 28, 29, 
56, 57, 59, 69; black-capped 
chickadee, 101—3, 104, 7.3; 
chanterelle species, 480; 
neotropical migrant birds, 584; 
yellow-billed cuckoo, 187-95, 
14.1-14.7, 14.1-14.5 

Humboldt, Alexander von, 739 

Humpbacked response curve. See 
Unimodal diversity-productivity 
curve 

Hunsaker, Carolyn T., 664 

Hurlbert, S. H., 58 

Huston, Michael A., 28, 29, 30-33 

Hutchinson, G. Evelyn, 749 

Hutto, Richard L., 26-27, 28, 358 

HVA (habitat viability analysis), 239 

Hydrogeomorphic method (HGM), 59 

Hydrologic models: Everglades-South 
Florida, 298, 467-74 

Hypothesis testing, 58-59, 465 

Hypothetico-deduction (HD), 27, 
57-58 


Iberian lynx (Lynx pardinus), 60.1 

ICBM (individual-based cowbird 
behavior model), 551-58, 
49.3—49.5, 49.2 

Idaho, 108, 282, 285-88, 574-75, 
IST 

Idaho Gap Analysis Project, 419, 574 

Idaho Panhandle National Forest, 574 

Idaho Southern Batholith, 680-84, 
60.1, cs60.2, 60.3—60.4, 60.2 


Idrisi, 142 

IFIM (instream flow incremental 
methodology), 522 

Ikonos, 299, 301 

Illinois Natural History Survey, 160 

IMAGINE software, 442 

Incidence of brown-headed cowbirds, 
220-22 

Independent data sets, 274; for 
Acadian flycatcher, 402; for atlas 
distribution maps, 392, 33.2; for 
Great Basin butterflies, 508, 
513-14; for large-landscape scale 
studies, 246; for model-based 
maps, 368; for Southern 
Appalachian bird species, 609, 
54.3; for species distribution 
models, 257, 384 

Indiana, 221 

Indian IRS LISS-II, 500 

Indigo buntings (Passerina cyanea), 
66, 228 

Indirect gradients, 74 

Individual-based models, 11-12, 70, 
266, 472 

Individual-based simulation models: 
Everglades-South Florida, 298, 
447-58, 39.2-39.5, 39.2-39.6; 
Fort Hood, Texas, 551-58, 
49.3—49.5, 49.2; Gulf Coast, 
584-86, 590-91, 52.3 

Individual habitat requirements, 
66-67 

Individual patch, 669—70, 59.1 

Induction, 126 

Inductive bias, 538-39, 545 

Inference, 294—95, 632 

Institute for Bird Populations, 728 

Instream flow incremental 
methodology (IFIM), 522 

Integrated multisensor analyses, 302 

Interior Coast, cs19.1, 19.2 

Interior Columbia Basin Ecosystem, 
241 

Intermediate disturbance hypothesis, 
19

Interpopulation density, 91-92 

Intrapopulation density, 90—91 

Invaded ranges, 67, 88; brown-headed 
cowbirds, 219-28 

Invasive brood parasites, 219—28 

Inventories of species, 65, 66-67, 419 


Inventory and Management Division 
(GSMNP), 531 

Island biogeography, 14, 67, 667, 
59.1 

Island occupancy models, 171-81, 
13.1-13.3 

Island of Mull (Scotland), 278-80 


Jackknife procedures, 274, 545, 
609—11, 54.2; estimator, 502, 44.2 

Jack pine (Pinus banksiana), 37 

Jenks, Jonathan A., 267 

Jepson Central West ecoregion, 235, 
238, cs18.2 

JMP IN software, 135 

Johnson, Catherine M., 124, 272, 293 

Johnson, Douglas H., 267 

Jones, Malcolm T., 267

Journal of Wildlife Management, 
53-56, 58, 3.1 

Journals, 60. See also names of 
individual journals 

Jura Mountains, France, 203 

Jura Mountains, Switzerland, 653-59, 
58.1, cs58.2 


Kappa statistic, 277, 324; in 
Belalp/Spring Mountains alpine 
species, 320, 321, 326, 25.3; in 
golden eagle nest site study, 278; in 
green woodpecker habitat study, 
199, 203; in vegetable distribution 
study, 135 

Karl, Jason W., 295, 358, 463 

Kendall’s tau, 135; in stream-fish 
study, 526, 46.2 

Kentucky, 221, 491 

Kentucky warbler (Oporornis 
formosus), 228 

Kern River Preserve (Calif.), 188 

k-fold partitioning, 274 

Kirtland's warbler (Dendroica 
kirtlandii), 37, 228 

Klamath Mountains, 93 

Klamath Province, 243, cs19.1, 19.3, 
19.1 

Klute, David S., 266, 292 

KNN (k-nearest neighbor), 332 

Knowledge-based models, 68-69, 124, 
206 

Kopta Slough (Calif.), 195 

Kriging, 626-28 

Krohn, William B., 124, 267, 272 


Kruskal-Wallis test, 55, 503, 3.1 


Lack, David, 749 

Lahontan Basin (Nevada), 329-34, 
26.2-26.4 

Lahontan cutthroat trout 
(Oncorhynchus clarki henshawi), 
327-34, 26.2-26.4

Lake County (Minn.), 414, cs35.3 

Lampropeltis getulus, 650 

Landbird monitoring programs, 
107-8, 8.1-8.2 

Land birds in Maine, 593-606 

Land cover: Greater Manchester, 
U.K., 256, 257, 260-61; Orbe 
Valley and Geneva Canton, 
Switzerland, 197—204, cs15.2, 
15.1; Santa Cruz County (Calif.), 
230, 232; United States, 220, 222 

Land-cover classes, 669; Oregon, 431, 
cs37.1; South Dakota, 442, 38.1 

Land-cover mapping and validation: 
Maine, 594-96, 53.2-53.5; 
Sacramento River (Calif.), 185-87, 
189-95, 14.2-14.5, 14.2-14.4 

Land-cover trends analysis (LCTA), 
548

Land-cover variables: for Monte 
Carlo simulation experiment, 378; 
San Pedro River (Ariz.), cs63.1, 
cs63.4—cs63.9; St. Louis and Lake 
Counties (Minn.), 414-15, cs35.3 

Landform index (LFI), 532 

Land management scenarios, 704, 
62.2 

Landsat satellite imagery, 301; Fort 
Hood, Texas, 549, 553; Greater 
Manchester, U.K., 256, 257; Gulf 
Coast, 582; Maine, 595; Nebraska, 
299; Sierra Nevada, 687—700, 
61.2-61.5, 61.2-61.5; South 
Florida, 298 

Landscape, 8, 45, 49, 51, 124 

Landscape feature, 51 

Landscape level, 49, 59 

Landscape-level metrics, 141-55, 
11.1-11.2; Gulf Coast, 583-84, 
590-91, cs52.1, 52.2-52.4, 
52.1-52.4 

Landscape pattern analysis: Baraboo 
Hills ecosystem (Wis.), 399—410, 
34.1—34.3, 34.1-34.7; Eastern 
Broadleaf Forest Eco-Province, 
160; Fort Hood, Texas, 557-58;
Gulf Coast, 581-92; Klamath 
Province, 248; San Pedro River 
(Ariz.), 718-25; Santa Cruz 
County (Calif.), 230, 233-34, 237, 
18.1, 18.2 

Landscape structure, 667-72, 59.1, 
cs59.2—cs59.4, 59.1 

Land use planning, 28; Great Basin, 
507-17; Santa Cruz County 
(Calif.), 229-39, cs18.3 

Large-landscape scale studies, 241-53 

Large-scale studies, 44, 57, 59 

Lark sparrow (Chondestes 
grammacus), 228 

Las Vegas, Nevada, 316, 25.1 

Late-successional reserves, 242, 
cs19.1, 19.3, 19.1-19.2 

Lavender thistle (Cirsium 
neomexicanum), 515 

Lazuli bunting (Passerina amoena), 
580 

LCTA (land-cover trends analysis), 
548 

Leadbeater's possum (Gymnobelideus 
leadbeateri), 60.1 

Learning, 347 

Least flycatcher (Empidonax 
minimus), 599—600, 53.1-53.3 

Least-square model (LS), 317-18 

Leopard frog (Rana pipiens), 160-61, 
163, 12.1-12.2 

Leopold, Aldo, 739, 749 

Leptospermum grandifolium, 311, 
24.1 

Level, 44-45, 51, 126 

Level of organization, 8, 51, 124 

Lewis’s woodpecker (Melanerpes 
lewis), 579 

LFI (landform index), 532, 533, 534 

Liebig’s Law of the Minimum, 12-16, 
1.1

Light-footed clapper rail (Rallus 
longirostris levipes), 60.1 

Likelihood Of Occurrence Ranks 
(LOORS), 270, 420-27, 
36.2-36.3, 36.3-36.4 

Limber pine (Pinus flexilis), 316, 323, 
324, 25.4 

Limiting frequency, 99 

Limiting resources, 12-16, 50, 
1.1-1.2 

Lincoln's sparrow (Melospiza 
lincolnii): Greater Yellowstone
ecosystem, 504; Idaho, 580; New 
Mexico, 642, 644, 650; Pacific 
Northwest, 731, 64.1-64.3, 
64.1-64.3 

Linear regression models: Great Basin 
butterflies, 508, 511, 513. See also 
GLM (generalized linear models) 

Linear regression, simple and 
multiple, 28, 29, 38, 41, 55, 110, 
3.1; breeding seabird model, 174; 
brown-headed cowbird study, 220; 
northern bobwhite model, 348-49, 
28.2, 28.1-28.2 

Local diversity, 10, 36 

Local experts, 233 

Local extinction, 300-301 

Locally Weighted Sums of Squares 
(LOWESS), 77 

Local scale, 37-38 

Location-based representations, 297 

Lodgepole pine (Pinus contorta), 109 

Logic of category definition, 99-101 

Logistic regression models, 28, 29-30, 
55, 58, 67, 620, 3.1; for American 
woodcock habitat use, 337-42; for 
amphibian habitat use, 161, 163; 
for breeding seabird study, 174, 
180; for butternut trees, 494, 
43.2-43.3; for forest songbird 
species studies, 383-90, 399-410; 
for fungal habitat model, 485-89, 
41.3-41.4, 42.1; for Great Basin 
butterflies, 508, 509, 45.2; for 
green woodpecker habitat study, 
198-99, 202, 15.3; for Monte 
Carlo simulation experiments, 
377-81, 414-18, 35.5; for 
northern spotted owl habitat study, 
245; for salmonid patch-based 
study, 330, 332; for Southern 
Appalachian bird species, 608-10, 
612, 613-14; for Swainson’s 
thrush habitat study, 110-13, 115, 
8.1-8.2; for wood thrush bias- 
control study, 541, 546; for wood 
thrush habitat study, 532-33, 47.1; 
for yellow-billed cuckoo, cs54.2 

Logit, 541 

Log-linear analysis, 55, 56, 3.1 

L-O-O (Leave-One-Out), 274, 331, 
565, 26.1 

LOORS (Likelihood Of Occurrence 


Ranks), 270, 420-27, 36.2-36.3, 
36.3-36.4 

LOSSURVIV, 727-28, 730, 732-33, 
736 

Louisiana, 582 

Louisiana State University, 135 

Louisiana waterthrush (S. motacilla), 
228 

Lower Keys marsh rabbit (Sylvilagus 
palustris hefneri), 60.1 

LOWESS (Locally Weighted Sums of 
Squares), 77 

Low sagebrush (Artemesia arbuscula), 
501 

Loxia curvirostra, 650 

LS (least-square) models, 317-18 

Lupo, Thomas, 268 

Lusk, Jeffrey J., 267, 268 

Lycaena heteronea, 504 

Lynx (Lynx lynx), 653-59, 58.1, 
cs58.2, 58.2-58.3 


MacArthur, Robert, 739, 741, 749 

MacGillivray’s warbler (Oporornis 
tolmiei): California, 236, 18.1; 
Idaho, 580; Pacific Northwest, 
731, 736, 64.1—64.3, 64.1—64.3 

MacKenzie vegetation scheme, 
530-31 

Macrohabitat, 45 

Macroinvertebrates, 159 

Magnolia warbler (Dendroica 
magnolia): Great Lakes Basin, 
377-81, cs31.1, 31.2-31.3,
31.1-31.2, 32.1-32.2; Maine, 598,
53.1-53.3 

Mahalanobis D2, 281-84, 43.3; Great 
Smoky Mountains National Park, 
492-97, cs43.2a-cs43.2b, 
43.1-43.3; for Southern 
Appalachian bird species, 608-10, 
612, 613-15; for yellow-billed 
cuckoo, cs54.2 

Maine Amphibian and Reptiles, 420 

Maine Breeding Bird Atlas (MBBA), 
419-20 

Maine Gap Analysis Project, 420, 
425, 593-606, 621-22, 36.1, 36.1, 
53.1-53.5, 53.3 

Maine land birds, 593-606 

Maine (mid-coastal) study area, 
171-81, 13.1 


Maine vertebrates, 419—27, 
36.1—36.3, 36.1-36.4 

Mallard (Anas platyrhynchos), 579

Mammalian carnivora, 89 

Management for population 
persistence under uncertainty, 16.2, 
16.1-16.2 

Management implications, 28, 41, 
465; ANN (Artificial neural 
network) models, 354—55; bias 
control in models, 546; 
biodiversity conflict analysis, 
238—39; bird-habitat relationship 
studies, 570-71; comparison of 
modeling techniques, 614-15; 
Effective Area Models, 724-25; 
forest songbird species study, 410; 
fungal habitat study, 488-89; 
gradient-based approach to 
landscape structure, 671-72; Great 
Basin butterflies study, 517; island 
occupancy models, 181; model- 
based maps, 374—75; models from 
atlas (gridded) data, 204; montane 
meadows study, 505-6; multiple- 
scale models, 168; patch-based 
models, 333-34; regional-scale 
models, 117, 119; spatial scale, 
154—55; stream-fish study, 527; 
vegetation maps, 139 

Management models, 54, 56, 59, 
60-61; defensibility of, 104-6 

Manager-scientist interactions, 242, 
246-47, 19.4 

Mann-Whitney U-test, 160, 163, 308, 
695, 12.3

Mann-Whitney-Wilcoxon test, 387 

Manomet Center for Conservation 
Sciences, 598, 600, 53.3-53.4 

Mantel's test, 565, 568 

Map accuracy, 295, 367-75, 
30.1-30.2, 30.1-30.3 

Mapping species’ habitats, 40, 
281-89, 367-75; fungi, 483-89; 
magnolia warbler, 377-81; 
northern spotted owl, 244; 
Pennsylvania bird species, 628-38, 
56.2-56.3, cs56.4-cs56.6; wood
thrush, 533; yellow-billed cuckoo, 
185-87

Maps, 8, 44 

MAPS (Monitoring Avian Productivity 
and Survivorship), 728, 729, 735 


Marbled murrelet (Brachyramphus 
marmoratus), 357-65, 29.1, 
29.1-29.2 

Marine Corps Air Ground Combat 
Center, 134 

Markov chain Monte Carlo (MCMC), 
630-38, 56.1-56.3, cs56.4-cs56.6

Mark-recapture models, 727-36 

Marsh wren (Cistothorus palustris), 
352, 370 

Marten (Martes americana), 153, 11.2 

Maryland Biological Stream Survey, 
587

MAUP (modifiable areal unit 
problem), 293, 740 

Maurer, Brian, 124 

Max, T. A., 363 

Maximum likelihood estimators, 363 

McFadden’s Rho-squared value (ρ2),
161, 163, 174 

McKenney, Dan W., 266 

McNemar test, 596, 53.1-53.3 

Meadow vole (Microtus 
pennsylvanicus), 442-45, cs38.2, 
38.1-38.2 

Measurement sensitivity, 7-11 

Mechanistic models, 127-28, 130, 
483 

Megapopulation, 92-93 

Melanerpes uropygialis, 650 

Mentzelia pumila, 42.1—42.5, 
42.1-42.2 

Merchant, James W., 266 

Merrills Landing, Calif., 185, 190 

Mesohabitat, 45, 51 

Meta-analysis, 20 

Metapopulation, 51, 510, 665, 
670-71, 685, cs59.4, 59.1 

Metapopulation models, 128 

Mexican spotted owl (Strix 
occidentalis lucida), 65 

Michaelis-Menten model, 511 

Microhabitat, 45, 51 

Mid-Lachlan survey, New South
Wales, Australia, 75-76, 5.1, cs5.2, 
$us 

Migratory birds, 581-92 

Migratory fishes, 327-34 

Minimum dynamic area, 195 

Minimum mapping unit (MMU), 197, 
200—202, 204, 234 

Mink frog (Rana septentrionalis), 
160-61, 12.1


Minnesota, 158, 12.1, cs35.3 

Minnesota Land Management 
Information Center, 160 

Misclassifications, 273, 277; cost of, 
275; Eurasian badger study, 
259—60; sagebrush community, 
503 

Mississippi, 221, 582 

Mist-net data, 727-28, 733-34, 
64.1-64.3 

MMU (minimum mapping unit), 197, 
200-202, 204, 234 

MOAB (Model of Animal Behavior), 
548 

Model, 51 

Model applicability, 320 

Model-based maps, 367-75 

Model evaluation, 123, 744 

Modeling tools, 266-67, 740-41, 743 

Modeling variable populations in 
space and time, 126-29 

Model of Animal Behavior (MOAB),
548 

Model selection, 464 

Model validation, 294-96 

Moderate resolution imaging 
spectroradiometer (MODIS), 301 

Modifiable areal unit problem
(MAUP), 293, 740

MODIS (Moderate resolution imaging 
spectroradiometer), 301 

Modoc Zone, cs19.1, 19.2 

Mohave Desert Ecosystem Program, 
134 

Mohua (Mohoua ochrocephala), 60.1 

Mojave Desert, 93, 133-39, 316 

Mojave National Preserve, 134 


Monitoring Avian Productivity and
Survivorship (MAPS), 728, 729,
735 

Monitoring programs, 107-8, 
8.1-8.2; Pacific Northwest bird 
species, 727-36; white-tailed deer, 
451-52, 39.1 

Montane hardwoods, 235, 238, 18.2

Montana, 108, 112 

Montane hardwood-conifer, 687 

Montane meadow communities, 
499-506, 44.1-44.3

Monte Carlo analyses, 295 

Monte Carlo simulation experiment, 
377-81, 414, 35.4 

Moore, Gordon, 740 
Moosehorn National Wildlife Refuge
(Me.), 36.1, 36.3, 36.1-36.2, 36.4 

Moraines District (Wis.), 400 

Moran’s I correlograms, 337-39, 
27.1-27.2; Oregon vertebrate 
species richness study, 434-40,
37.2-37.5 

Morrison, Michael L., 124, 368, 740 

Mosaics, shifting, 195-96 

Mountain big sagebrush (Artemisia
tridentata), 501 

Mountain bluebird (Sialia 
currucoides), 580 

Mountain brushtail possum 
(Trichosurus caninus), 80 

Mountain brushtail possum
(Trichosurus caninus), 60.1 

Mountain chickadee (Poecile 
gambeli), 579 

Mount Baker National Forest, 728 

Mount Desert Island (Me.), 421, 36.1, 
36.3, 36.1-36.3, 36.4 

Mourning dove (Zenaida macroura): 
Idaho, 579; Pennsylvania, 636, 
cs56.4-cs56.6

MRPP (multiple response permutation 
procedures), 55, 3.1 

Mt. Graham red squirrel 
(Tamiasciurus hudsonicus 
grahamensis), 260, 399 

Muir, John, 739 

Multicollinearity, 77, 199, 293, 370 

Multifactor response reversals, 19-21 

Multimodeling, 467-74 

Multiple response permutation 
procedures (MRPP), 55, 3.1 

Multiple-scale models: amphibians, 
157-69 

Multiple-scale models, assessment of, 
128-29, 131; amphibian habitat 
models, 157-69; biodiversity 
conflict analysis, 229-39 

Multiple species, assessing viability of, 
682-84 

Multi-resolution Land Characteristics 
Consortium, 442 

Multiscale analyses, 126, 229-39 

Multiscale modeling, 128-31, 9.1 

Multivariate analyses, 54-56, 58, 59, 
3.1; for forest songbird species 
models, 402 

Museum collections data, 537-38, 
546 




Nardus stricta, 321, 324, 25.3, 25.4 

NASA, 301 

Nashville warbler (Vermivora 
ruficapilla): Great Lakes Basin, 
387, 32.1-32.2; Idaho, 580; 
Maine, 595, 53.1-53.3 

National Environmental Policy Act of 
1969, 56 

National Forest Management Act 
(NFMA), 243, 249, 673 

National Map Accuracy Standards 
(NMAS), 429 

National Oceanic and Atmospheric 
Administration (NOAA), 433, 540 

National Park Service (U.S.), 420, 
492, 530 

National Science Foundation, 619 

National Vegetation Classification 
System, 134 

National Wetland Inventory (NWI), 
158, 160, 442 

Natural Heritage programs, 233, 492 

Natural history studies, 25-26, 27, 
40-41, 53, 54, 69, 294 

Natural Resource Conservation 
Service (U.S.), 300 

The Nature Conservancy, 318, 400, 
432 

NDVI (normalized difference 
vegetation index) satellite imagery, 
433, 595 

Nebraska, 300; climate variables, 299, 
23.1; soil types, 300 

Nebraska State Museum, 301 

Needle-and-thread (Stipa comata), 
441 

Neotropical migrant birds, 529, 
581-92 

Nesowadnehunk Field, Baxter State 
Park (Me.), 36.1, 36.3, 36.1-36.2, 
36.4 

Nest count data, 174 

Nest parasitism, 556 

Network architecture, 347-48 

Neural network models. See ANN
(Artificial neural network) models 

Neurobiology, 346 

Neurons, 346 

New Mexico Gap Analysis Program, 
639-51 

New South Wales, Australia, 75, 5.1 

New Zealand, 73-82


Neyman-Pearson hypothesis testing, 
432 

Nez Perce National Forest (Idaho), 
574 

Niche, 46-47, 52, 55, 81 

Niche-gestalt, 35, 55, 67 

Niche modeling, 18, 35, 37-38, 
67-68, 620-22; controlling bias in, 
537-46 

Niche width, 9, 369, 505, 593-606 

NMI (normalized mutual 
information), 256, 277, 320 

NOAA (National Oceanic and 
Atmospheric Administration), 433, 
540 

Noon, Barry, 664 

North American Amphibian 
Monitoring Program (NAAMP), 
158 

North American Biodiversity 
Information Network, 619 

North American Breeding Bird Survey. 
See BBS (Breeding Bird Survey) 

North American Truffling Society, 480 

North American waterfowl, 242 

North Dakota, 221, 391-97 

Northerly bias to bird movement, 
SEOS 

Northern alligator lizard (Elgaria 
coerulea), 18.1 

Northern bobwhite (Colinus 
virginianus), 345-55, 28.1-28.5,
28.1-28.2 

Northern cardinal (Cardinalis 
cardinalis), 228, 556 

Northern flicker (Colaptes auratus), 
572

Northern goshawk (Accipiter gentilis), 
89, 579 

Northern parula (Parula americana), 
594, 53.1-53.3 

Northern Region Landbird 
Monitoring Program (USFS), 107, 
118, 574 

Northern Region (USFS), 574-75, 
Sorel 

Northern spotted owl (Strix 
occidentalis caurina), 65; Klamath 
Province, 241-53, 19.2, 19.5, 
19.1-19.2; Pacific Northwest, 478, 
701-12, cs62.1, cs62.2, cs62.3,
62.1-62.2; Sierra Nevada, 
687-700, 61.1-61.3


Northern waterthrush (Seiurus 
noveboracensis), 580 

North Maine Forestlands, Moosehead 
Lake Area, 36.1, 36.3, 36.4 

North Maine Forestlands Study, 
Moosehead Lake Area, 36.1-36.2 

Northwest Forest Plan, 242, 243, 244, 
248, 251, 475, 701 

Norway spruce (Picea abies), 198 

Nothofagus cunninghamii, 24.1 

NRF (nesting, roosting, and foraging) 
habitats, 702, 704 

Null hypothesis. See Errors of 
commission and omission 

Numerical comparisons as basis for 
habitat models, 83-95, 6.1-6.2, 
6.1 


Oak (Quercus alba), 402 

Oak (Quercus borealis), 402 

Ochotona princeps, 650 

Oklahoma, 301, 345-47, 354-55 

Oklahoma Department of Agriculture, 
347 

Oklahoma Department of Wildlife 
Conservation, 346, 355 

Olive-sided flycatcher (Contopus 
cooperi), 109, 236, 579, 18.1 

Olympic Peninsula (Wash.), 701-12, 
cs62.1, cs62.2, cs62.3

Omernik ecoregions, 221 

O'Neill, Bob, 747 

Orange-crowned warbler (Vermivora 
celata), 580 

Orbe Valley, Switzerland, 197-204, 
15.1,:6905.2..1.5:2 9 0 800800602 

Ordinal regression models, 315-26, 
25.5, 25.2-25.4

Ordinal scales, 315, 316-18, 25.1 

Ordnance Survey (U.K.), 256 

Oregon, 432-40, cs37.1 

Oregon Mycological Society, 480 

Oregon State University, 480 

Osprey (Pandion haliaetus), 579 

Outliers, 273 

Ovenbird (Seiurus aurocapillus), 152, 
228, 11.2; Maine, 595, 53.1-53.3; 
Southern Appalachian Assessment, 
608, 612, 613, 54.2-54.3; 
Wisconsin, 404-5, 34.1, 
34.1-34.2, 34.5-34.7 

Overfitting, 274, 348 

Overtraining, 347 


OWL, 703 
Owl surveys, 243 
Oxalis magellanica, 24.1 


Pacific Coast (U.S.), 358-59

Pacific giant salamander 
(Dicamptodon ensatus), 18.1 

Pacific golden chanterelle 
(Cantharellus formosus), 479 

Pacific Northwest, 363, 475-81, 
701-12, 727-36

Pacific Northwest Research Station, 
480 

Pacific Seabird Group, 358 

Pacific-slope flycatcher (Empidonax 
difficilis), 730, 64.1-64.3, 
64.1-64.3 

Pacific Southwest Research Station 
(USDA), 244 

Painted bunting (Passerina ciris), 228 

Palouse agriculture lands, 574 

Paradigm shifts: to adaptive 
management, 251; in distributional 
modeling, 30-33; with remote 
sensing imagery, 301 

Parameterization of models, 52, 
205-6 

Parasites: brown-headed cowbirds 
(Fort Hood, Texas), 547-58, 
49.1-49.3, 49.1; brown-headed
cowbirds (United States), 219-28, 
el AZT 

Parrots, 276, 278 

Partitioning, 274, 278; in use of D²,
283-84 

Passerine bird species, 85 

Patch, 52

Patch-based models: salmonids, 
327-34; yellow-billed cuckoo, 
183-96 

Patch cohesion, 233-34, 235, 237 

Patch/matrix approach, 667-68, 59.1 

Patch structuring: Boise River and 
Lahontan Basins, 328-33, 
26.3-26.4; San Pedro River (Ariz.), 
718-25 

PATREC (pattern recognition models), 
$5867" 60 

Pattern-process relationships, 430-32 

PCA (principal components analysis), 
55, 3.1; in amphibian habitat 
models, 160; in breeding seabird 
study, 174; in Eurasian badger
study, 256, 257, 20.1-20.3; in
Oregon vertebrate species richness 
study, 433 

PC-ORD, 135 

Pearce, Jennie L., 268 

Pearson, Scott M., 463 

Pearson regression residuals, 338, 
3421] 29 72 22087 

Pearson's correlation coefficient, 160, 
174, 175, 199, 401, 692, 61.4 

Pearson’s planes of closest fit, 284-89 

Pectoral sandpipers (Calidris 
melanotos), 70 

Pemaquid Point, Maine, 172, 13.1 

Penguins, 66 

Pennsylvania, 336-37, 27.1-27.2 

Pennsylvania Cooperative Fish and 
Wildlife Unit, 336 

Pennsylvania Game Commission, 336 

Pennsylvania Gap Analysis project, 
337 

Peregrine falcon (Falco peregrinus), 
272, 60.1 

PERMAKART model, 319 

Peromyscus pectoralis, 650 

Persian fallow deer (Dama dama 
mesopotamica), 60.1 

Persistence, 509-10; Apache silverspot 
butterfly, 515-16 

Perturbation tests, 640-41, 647, 57.1 

Petersham, Mass., 102 

Peterson, Townsend, 462, 463 

Petit Manan National Wildlife Refuge 
(Me.), 36.1, 36.3, 36.1-36.2, 36.4 

Peucedramus taeniatus, 650 

Pheasant (Phasianus colchicus), 11.2 

Phebalium bilobum, 311, 312, 24.1 

Phenomenological models, 127-28 

Physiological niche, 18 

Phytosociology, 315, 25.1 

Piedmont National Wildlife Refuge 
(PNWR), 210-11 

Pig’s ear (Gomphus clavatus), 41.1 

Pileated woodpecker (Dryocopus 
pileatus), 11.2; California, 236, 
18.1; Idaho, 579, 682-84, 60.2; 
Rocky Mountains, 109 

Pine Creek Wildlife Area (Calif.), 185, 
188, 190, 194-95 

Pine grosbeak (Pinicola enucleator), 
580 

Pine (Pinus spp.), 530 
Pine siskin (Carduelis pinus), 236,
580, 18.1 

Pisgah National Forest (N.C.), 491 

The Plants Database, 134 

Platte River, 66 

Pluchea sericea, 135

Plumbeous vireo (Vireo cassinii), 580 

PM (patch/matrix approach), 667-68, 
59.1

Pocket gophers (Thomomys spp.), 88, 
6.7 

Point-count surveys, 400, 560, 574, 
598, 54.2; Southern Appalachian 
Assessment, 607-8 

Poisson regression models, 411-18, 
35.2, 35.5-35.6; Belalp/Spring 
Mountains alpine species, 316, 
319-20, 321, 325-26, 25.2-25.3, 
25.2; Calling Lake bird species, 
563-64, 568; greater glider, 80;
Pennsylvania bird count data,
628-36, 56.2-56.3, cs56.4-cs56.6;
Swainson’s thrush study, 112, 115 

Ponderosa pine (Pinus ponderosa): 
Belalp/Spring Mountains, 324, 
25.4; Idaho Southern Batholith, 
683; Rocky Mountains, 109, 113, 
117; Sierra Nevada, 687 

Population, 8, 48-49, 52, 655, 673 

Population density. See Abundance 

Population dynamics, 39-40, 48-49, 
368-69 

Population models, 69-70, 126-29 

Population patch, 670, cs59.4, 59.1 

Population structure, 673 

Population viability analysis. See PVA 
(population viability analysis) 

Populus tremuloides, 109 

Populus trichocarpa, 109 

Porpoises, 66 

Positive Predictive Power (PPP), 278 

Potlatch Corporation (Idaho), 108, 
574 

Power analyses, 58, 357-65, 29.1 

PPP (Positive Predictive Power), 278 

Prairie dogs (Cynomys spp.), 64, 87 

Prairie warbler (Dendroica discolor), 
228 

Precision, 52 

Predicting Species Occurrences: Issues 
of Scale and Accuracy 
(symposium), 35 

Prediction errors, 78, 272-73, 277; in 
ANN models, 347, 348; in
California breeding birds study, 
371-72; cost of, 274-75, 280 

Predictions, critical issues for 
improving, 7-21 

Predictive distribution modeling. See 
Distribution models 

Predictor variables. See Environmental 
variables 

Preference. See Habitat preference; 
Resource preference 

Prelogging surveys, 304 

Prescribed burning, 210-11, 249, 16.2 

Presence-absence models, 442-45, 
cs38.1-cs38.2, 38.3

Presence-absence surveys and 
statistics, 37, 57, 59, 61, 745-46; 
for atlas distribution maps, 
391-97; for Australian rare plant
species, 303-13; for Belalp/Spring
Mountains alpine species, 319-21,
325-26, 25.5, 25.3-25.4; for
biodiversity models, 66-67; for
breeding birds in California, 369,
370-71; for breeding birds in
Idaho, 574, 579; for butternut
trees, 491-97, 43.2-43.3; for
Eurasian badger, 257; for forest
songbird species, 384; for fungal
habitat model, 483-89, 42.1-42.2;
for lynx, 654, 655, 657-58, 58.2;
for marbled murrelet, 357-65, 
29.1-29.2; for northern spotted 
owl, 244-48; for sage sparrow, 
285-88; for Southern Appalachian 
bird species, 609-10, 613-14; for 
Swainson’s thrush, 110, 113, 8.2; 
for wood thrush bias-control study, 
541-42, 48.2-48.4; for yellow- 
billed cuckoo, 183-96 

Presence-only bias, 539, 541-43, 545, 
48.2, 48.1 

Prevalence, 275-76 

Primary point occurrence data, 
619-20, 622 

Principal components analysis. See 
PCA (principal components 
analysis) 

PRISM, 485, 509 

Probabilistic models, 99, 102, 663 

Probability of occurrence: for 
butternut trees, 491-97,
cs43.2a-cs43.2b; for forest
songbird species, 387-89, 34.2; for
Great Basin butterflies, 509; for
lynx, 656; for magnolia warbler,
377-81, 31.1-31.2; for northern
spotted owl, 244-46, 19.2-19.3; 
for Southern Appalachian bird 
species, 609; for stream fishes, 
521-24; for wood thrush, 529-35, 
cs47.2, 47.2 

Probability theory, 99-101, 104, 7.2 

Proc Discrim, 609 

Proc Logistic, 532, 608 

Proportional Odds Model, 315-26 

Prosopis glandulosa, 135 

Prospective sampling, 274 

Protected species: Eurasian badger, 
255-62 

Protection of Badgers Act, 1992, 255 

Protective status, 66, 233, 235 

Prothonotary warbler (Protonotaria 
citrea), 228 

Pseudo-absences, 537 

Pseudoreplication, 58, 91, 94, 111-12 

Publications, 60; on northern spotted 
owl, 252, 19.5. See also names of 
individual publications 

Puma (Puma concolor), 92-93 

Purple finch (Carpodacus purpureus), 
595, 53.1-53.3

PVA (population viability analysis), 
664, 673-85, 60.1, cs60.2, 
60.3-60.5, 60.1-60.2 

Pygmy nuthatch (Sitta pygmaea), 580 


QNet for Windows, 347 

QQ-plots (quantile-quantile plots), 
321, 25.2

Quadratic terms, 110, 115 

Quantile regression, 13, 15, 1.2 

Quantitative data, 315 

Quantitative ecology: history of, 
53-61; need for rigorous logic in, 
25-26 

Quaternary Geologic Atlas, 160 


Raccoon (Procyon lotor), 152 

Rachel Carson National Wildlife 
Refuge (Me.), 421, 36.1, 36.3, 
36.1-36.3, 36.4 

Radar, imaging, 299-300 

RADARSAT, 300 

Radiotelemetry, 70, 225, 653, 654, 
656, 689 


Rainforests, 304 

Ralph, C. J., 740 

Random error, 429 

Random variables, 79 

Range, 52 

Range maps, 291; breeding birds in 
California, 367-75, 30.1-30.2, 
30.1-30.3; fungi, 483-89, 
42.1-42.5 

Ranges, ancestral and colonized: 
brown-headed cowbirds, 219—28 

Ranked models: neotropical migrant 
birds, 581-92; small mammals, 
441-45, cs38.1-cs38.2, 38.3, 38.2

Rank index, 384-87 

Raphael, Martin G., 664 

Raptors, 66 

Rare species, 18, 66, 295, 419; 
Central Highlands, Victoria, 
Australia, 303-13, cs24.1, 24.1; 
Central Valley, Calif., 88-89; 
Greater Yellowstone ecosystem, 
504, 506 

Rarity effect, 573, 578 

Reach, 520-22, 46.1 

Realized niche, 38 

Reclassification, 338, 340 

Recolonization, 653, 654 

Red-breasted nuthatch (Sitta 
canadensis): Alberta, 560, 567; 
Idaho, 580; Maine, 594, 595, 604, 
50.1, 50.5-50.6, 53.1-53.3 

Red-cockaded woodpecker (Picoides 
borealis), 210-11, 16.2, 60.1 

Red crossbill (Loxia curvirostra), 580 

Red deer (Cervus elaphus), 79, 336 

Red-eyed vireo (Vireo olivaceus), 228; 
Idaho, 580; Maine, 595, 53.1-53.3 

Red fir (Abies magnifica), 687 

Red-naped sapsucker (Sphyrapicus 
nuchalis), 579 

Red-tailed hawk (Buteo jamaicensis), 
572 

Red-winged blackbird (Agelaius 
phoeniceus), 228, 352, 580, 655, 
664 

Redwoods (Sequoia sempervirens), 
232, 233, 235, 236, 238, cs18.2,
18.2 

Regional densities, 92-93 

Regional forest agreements (RFA), 
303; Central Highlands, Victoria, 
Australia, 309, 313, cs24.1 


Regional planning: Central Highlands, 
Victoria, Australia, 305; Santa 
Cruz County (Calif.), 229-39 

Regional-scale exploratory studies: 
Rocky Mountains, 107-19; 
Southern Appalachian Assessment, 
607-15 

Regional surveys, power for, 360, 
362-65 

Regression analysis, 28, 29-30, 77,
142, 348; history of, 54, 56, 57, 
58; for model-based maps, 369-71, 
30.2, 30.3 

Regression modeling, 107-19, 411, 
8.2, 35.1. See also Autologistic 
regression modeling; Linear 
regression, simple and multiple; 
Logistic regression models; Ordinal 
regression models 

Regulators, 12-13 

Reiners, William A., 462 

Relative operating characteristic 
curve. See ROC (relative operating 
characteristic) curve

Relevé, 316-17; Mojave Desert, 134 

Reliability assessment, 128-29, 131, 
205-7, 295 

Relic environments, 94-95 

Remote sensing imagery, 57, 60, 301; 
Belalp area (Switzerland), 318; 
Greater Manchester, U.K., 260-61;
Greater Yellowstone ecosystem,
499-501; Great Plains, 299-300;
Mojave Desert, 133; Southern 
Appalachian Assessment, 608, 54.1 

Renormalization, 294 

Reptile surveys, 419-27, 36.2, 36.3 

Rescaling relationships, 293-94 

Researcher-manager gap, 57 

Resolution, 9-10, 52 

Resource abundance, 50, 52 

Resource aggregations, 87-88, 89 

Resource availability, 50, 52 

Resource gradients, 74 

Resource preference, 52 

Resource preference assessments, 
55-57 

Resources, 45, 50, 52, 669, 673, 
cs59.2, cs59.4; limiting, 12-16 

Resource selection, 52 

Resource specialists, 243-44, 19.4 

Resource use, 50, 52 

Restoration opportunities models:
Everglades-South Florida, 467-74,
40.1; Santa Cruz County (Calif.),
230, 232-33, 235, 18.1 

Restudy, 471-72 

Resubstitution, 274, 278 

RFA (regional forest agreements), 
303; Central Highlands, Victoria, 
Australia, 309, 313, cs24.1 

Rhone Valley, 316 

Ridge and Valley Province (Pa.), 336

Ringtail cat (Bassariscus astutus), 152 

Riparian-associated species, 66, 235; 
Abert's towhee, 720, cs63.1, 
63.1-63.2; Apache silverspot 
butterfly, 514; Australian rare 
plant species, 304; Black-throated 
sparrow, 720, cs63.1, 63.1-63.2; 
brown-headed cowbirds, 220; 
yellow-billed cuckoo, 183-96; 
Yellow warbler, 720, cs63.1, 
cs63.7, cs63.9, 63.1-63.2 

River Vista Unit (Merrills Landing, 
Calif.), 185, 190, 195

Roadside counts and surveys, 220, 
346, 347; controlling bias in, 540, 
545, 48.1, 48.4 

Robertsen, Margaret J., 268 

Rock wren (Salpinctes obsoletus), 580 

Rocky Mountain Herbarium (RM), 
484 

Rocky Mountains, 107 

ROC (relative operating 
characteristic) curve, 276-77, 21.2; 
Australian rare plant species, 308; 
Belalp/Spring Mountains alpine 
species, 320, 324, 25.5; forest 
songbird species, 385, 387, 32.1; 
lynx, 655; Swainson's thrush 
habitat study, 111-12, 113, 
8.3-8.5 

ROD (record of decision), 475, 476, 
479 

Rolling Red Plains ecoregion (Okla.), 
345-47, 354-55, 28.1-28.5, 
28.1-28.2 

Roloff, Gary J., 664 

Romesburg, H. C., 57-58

Rose-breasted grosbeak (Pheucticus 
ludovicianus), 228, 595, 53.1-53.3 

Rotenberry, John T., 267, 739, 744 

Rubber boa (Charina bottae), 18.1 

Ruby-crowned kinglet (Regulus 
calendula), 580 
Ruffed grouse (Bonasa umbellus), 579

Rufous hummingbird (Selasphorus 
rufus), 579 

Rules-based modeling, 480, 41.1 

Rule-set method of bias control, 
541-42, 546 


Sacramento River Basin, 468 

Sacramento River (Calif.), 183-96, 
14.1-14.5, cs14.6 

Sacramento River Conservation Zone 
(SRCZ), 195

Sagebrush (Artemisia tridentata), 285, 
746 

Sagebrush community, 501, 503 

Sage sparrow (Amphispiza belli): 
Idaho, 285-88, cs22.1-cs22.2,
22.3, cs22.4, 22.1-22.3;
Oregon/Nevada, 745, 746, 65.1 

Salamander niche relationships, 49

Salmonids, 327-34, 26.1 

Samango monkey (Cercopithecus 
mitis), 60.1 

Sample region. See Extent 

Sample resolution, 8 

Sampling, 75-76 

Sampling bias, 539 

Sampling design, 632-35 

Sampling intensity, 143, 505 

Sampling response variables, 318 

Sampling sizes, 143-49; in breeding 
birds (Idaho) study, 573-74, 
51.2-51.3; in Monte Carlo 
simulation experiment, 379-81, 
31.1-31.2, 32.2-32.3 

Sampling, spatial and temporal issues 
in, 7-11 

Sampling units, power for, 360-65, 
29.1, 29.1-29.2 

Sandhill crane (Grus canadensis), 60.1 

Sandhill soil types, 235; Santa Cruz 
County (Calif.), 233 

San Diego State University, 134, 691 

San Pedro River (Ariz.), 718-25 

Santa Cruz County (Calif.), 229-39, 
18.1, cs18.2-cs18.3, 18.1-18.2

Santa Fe Institute, 552 

Santa Rosalia, 749 

Sargeant, Glen A., 267 

Sarsaparilla (Aralia nudicaulis), 560 

Satellite imagery: Greater Manchester, 
U.K., 256; Greater Yellowstone 
ecosystem, 500-501; Mojave 
Desert, 133; Oregon, 431, 432,
433; South Dakota, 441-42; St. 
Louis and Lake Counties (Minn.), 
414; United States, 220 

Savannah sparrow (Passerculus 
sandwichensis), 228, 504, 580 

Scale, 7-8, 52, 125-26, 202 

Scale, mapping, 8, 44-45 

Scaled quail (Callipepla squamata), 
354 

Scale of observation, 52, 125-32, 
153-54, 251, 746-47; in green 
woodpecker study, 197-204; in 
stream-fish study, 519-27, 46.2, 
46.1-46.2 

Scaly chanterelle (Gomphus bonari), 
41.1 

Scaly chanterelle (Gomphus 
kauffmanii), 41.1 

Scarlet tanager (Piranga olivacea), 228 

Sceloporus poinsettii, 650 

Schaafsma and van Vark's rule, 278 

Schaefer, Sandra M., 267 

Schoodic Point, Maine, 172, 13.1 

Schreuder, H. T., 363 

Scotland, 203 

SDSS (spatial decision support 
systems), 229-30 

Seabirds, breeding, 171-81 

Search effort, 393-94 

SEA (Special Emphasis Area), 702, 
704 

Semiquantitative response models, 
315-26 

Semivariogram analysis, 142-51, 
11.1-11.7; Oregon vertebrate
species richness study, 434-40, 
37.2-37.5 

Sensitivity analysis, 52, 206-8, 
276-77; biodiversity conflict study,
234; New Mexico vertebrates, 
646-47, 649; Southern 
Appalachian bird species, 609; 
Swainson's thrush study, 113 

SESI (Spatially Explicit Species Index), 
471-72 

Setts, badger, 255-62, 20.1, 20.1-20.3 

Shad-scale (Atriplex confertifolia), 
285 

Shannon-Weaver index of topographic 
complexity, 532, 533, 534 

Shapiro, Ann-Marie, 463 

Shapiro-Wilk W test, 135


Sharp-shinned hawk (Accipiter 
striatus), 236, 579, 18.1 

Sharp-tailed grouse (Tympanuchus 
phasianellus), 419 

Shenandoah National Park, 523, 46.1 

Shriner, Susan A., 462 

Sideoats grama (Bouteloua 
curtipendula), 441 

Sierra National Forest, 687-700, 61.1 

Sierra Nevada, 153, 238, 510, 665 

Sierra Nevada Ecosystem Project, 241 

Significant Natural Areas, 235 

Simons, Theodore, 463 

Simpson’s diversity index, 141, 561, 
583 

Simulation analyses of artificial neural 
network model, 349, 352, 
28.3-28.5 

Simulation models, 212-16; 
Everglades-South Florida, 447-58, 
39.2-39.5, 39.2-39.6; GIS and, 
298; Idaho, 285-88, 
cs22.1-cs22.2, cs22.4, 22.1-22.3;
Monte Carlo experiments, 377-81, 
414-18, 35.5. See also Individual- 
based simulation models; Spatially 
explicit simulation models 

Singing-ground surveys, 336-43, 
27.1-27.2, 27.1-27.2 

Single-factor response reversals, 16-19 

Sink populations, 52 

Sirococcus clavigignenti-
juglandacearum, 491

Sisk, Thomas D., 664 

Sic 

Site-specific explanatory variables, 
157-69, 12.3-12.6 

Sitta pygmaea, 650 

Siuslaw National Forest, 728 

Six Rivers National Forest (Calif.), 
359 

Skewed β-functions, 78

Small mammals, 441-45 

Small-scale studies, 44 

Smallwood, K. Shawn, 29, 30, 673 

SME (Spatial Modeling Environment) 
software, 549 

Smith, Vickie J., 267 

Smooth brome (Bromus inermis), 441 

Smoothing methods, 267, 
cs33.1-cs33.6, 33.1; for atlas
distribution maps, 392-97 


Snail kite (Rostrhamus sociabilis), 
471, 472, 60.1 

Snake River Birds of Prey National 
Conservation Area, 285, 
cs22.1-cs22.2, cs22.4

Snake River Valley, 574 

Snowy egret (Egretta thula), 370 

Sociology, 742 

Soil Conservation Service (U.S.), 158 

Soil Survey and Land Research 
Centre, U.K., 256 

Soil Survey Geographic database 
(SSURGO), 300 

Soil types, 233, 256; Great Plains, 300 

SOLARFLUX model, 319 

Somers’ dyx. See dyx statistic 

Song sparrow (Melospiza melodia), 
228; Idaho, 576, 580, 51.4-51.5;
Maine, 598, 53.1-53.3; Pacific
Northwest, 731, 64.1-64.3,
64.1-64.3 

Sooty shearwaters (Puffinus griseus), 
60.1 

Source populations, 52 

Source-sink habitat structure, 68, 81 

South Dakota, 441-45 

Southern Appalachian Assessment, 
607-10, 54.1 

Southern Appalachian Assessment 
database, 492 

Southern beech (Nothofagus), 76, 79 

Southern California Natural 
Communities, 241 

Southern pine beetle (Dendroctonus 
frontalis), 530 

South Florida Management District, 
471 

South Platte River, Colo., 101 

South Science Center (USGS), 582 

Spatial Analyst, 187 

Spatial autocorrelation, 429-40; in 
American woodcock habitat use 
study, 335-43; in Australian rare 
plant species, 313; in Australian 
vegetation and fauna study, 79, 81; 
in breeding seabird study, 180; in 
Calling Lake bird species study, 
565, 568 

Spatial decision support systems 
(SDSS), 229-30 

Spatial errors, 495-96 

Spatial heterogeneity, 141 

Spatially explicit simulation models:
Calling Lake Study Area (Alberta),
559; Everglades-South Florida,
447-58, 39.2-39.5, 39.2-39.6;
Pacific Northwest, 701-12, 62.5,
62.3-62.5; San Pedro River (Ariz.),
718-25 

Spatially Explicit Species Index (SESI), 
471-72 

Spatial models, 296-97; neotropical 
migrant birds, 581-92 

Spatial scale, 141-55, 11.2; in 
sampling, 7-11; validation of 
models and, 36-37, 206. See also 
Multiscale modeling 

Spatial smoothing, 392, cs33.1-cs33.6 

Spatial statistical methods, 434 

Spatial variability, 664, 714-15

Spatiotemporal data in GIS, 291, 
297-98 

Spearman's Rho, 135, 161, 201, 421 

Special Emphasis Areas (SEA), 702, 
704 

Special features models: Santa Cruz 
County (Calif.), 230, 233, 235-37 

Specialists, 593; neotropical migrant 
birds, 584, 590, 52.2; Southern 
Appalachian bird species, 608 

Species, 665 

Species Analyst, The, 619 

Species assemblage, 47 

Species distribution models. See 
Distribution models 

Species-energy relations, 40 

Species-environment relations, 
foundations of, 35-41

Species-habitat correlations, 67-69; 
Great Basin butterflies model, 511; 
montane meadows study, 499-506, 
44.3 

Species habitat models. See Habitat 
association models 

Species richness: eucalypt forests, 79; 
Great Basin, 507-17; Oregon, 
432-40, 37.1-37.2; United States, 
222, 17.2-17.3 

Specificity in modeling, 64, 4.1 

Speyeria mormonia, 504 

Speyeria nokomis apacheana, 514-16, 
45.2 

S-Plus (Mathsoft), 220, 319, 337 

SPOT satellite imagery, 500; Mojave 
Desert, 133 

Spotted owl (Strix occidentalis), 66,
153, 154, 357, 363. See also
California spotted owl (Strix 
occidentalis occidentalis); northern 
spotted owl (Strix occidentalis 
caurina) 

Spotted sandpiper (Actitis macularia), 
5958 

Spotted towhee (Pipilo maculatus), 
576, 580, 51.4-51.5

Spread error, 389 

Spring Mountains (Nevada), 316-19, 
2 ul; 2nd 

Spring peeper (Pseudacris crucifer), 
160-61, 163, 164-65, 167, 12.1, 
12.3-12.4, 12.7 

Spruce (Picea engelmannii), 109, 113 

SR? survey strategy, 75-76, 5.3 

STATA, 565 

Statistical bias, 538 

Statistical methods, 53-61, 3.1 

Statistical modeling, 76-79, 5.4 

Statistical tools and techniques, 28-30 

Statistix, 348 

STATSGO database (State Soil 
Geographic), 300 

Stauffer, Dean F., 25-26, 27, 291,
358, 739, 740 

Stauffer, Howard B., 267 

Steger, G. N., 689 

STELLA modeling software, 549, 
49.2 

Steller's jay (Cyanocitta stelleri), 579 

Stenotopes. See Specialists 

Stepwise regression, 77, 174 

Stepwise selection methods, 111, 199 

St. Louis County (Minn.), 414, cs35.3 

St. Michael, 749 

Stockwell, David R. B., 463 

Stopover habitats, 581-92 

Storage media, 8, 54 

Strategic Environmental Research and 
Development Program, 134 

Stream fishes, 519-27, 46.1-46.2,
46.1-46.2 

Striated thornbill (Acanthiza lineata), 
159 

Student's t-test, 348, 350 

Study area, 52 

Sturnella magna, 650 

Suaeda moquinii, 135 

Subalpine fir (Abies lasiocarpa), 682 

Subpopulations, 673, 685 

Subspecies, 665 
Sugar maple (Acer saccharum), 402

Sumatran rhino (Dicerorhinus
sumatrensis), 60.1 

Sum-of-squares criterion, 349-50, 
352, 28.1 

Sunkhaze Meadows National Wildlife 
Refuge (Me.), 36.1, 36.3, 
36.1-36.2, 36.4 

Survey designs, 75-76, 357-58; for 
American woodcock habitat use, 
336-43; for amphibian habitat 
models, 158-60, 12.1; for bird- 
habitat relationship studies, 108-9, 
118, 189; for landbird monitoring 
programs, 107-9; for marbled 
murrelet, 358-59, 363; for 
northern bobwhite ANN model, 
346; for salmonid patch-based 
study, 329; for small mammal 
study, 442; for spotted owl, 363 

SURVIV, 730, 732-33, 736 

Swainson’s thrush (Catharus 
ustulatus): Idaho, 580; Pacific 
Northwest, 731, 735, 64.1-64.3, 
64.1-64.3; Rocky Mountains, 
108-19, 8.2, 8.6, 8.2 

Swamp sparrow (Melospiza 
georgiana), 228 

Swarm (individual simulation 
environment), 548, 552-53 

Swiss Alps, 316-19, 25.1 

Swiss Federal Office of Topography, 
318 

Swiss Jura Mountains, 653-59, 58.1, 
cs58.2 

Switzerland, 197-204, 15.1 

Sylvilagus nuttallii, 650 

Synapses, 346-47 

Systemic bias, 273 

Systemic errors, 429 

Systems, 45 


tau coefficient, 277 

TCI (topographic convergence index), 
531

Temporal dimensions of geospatial 
data, 291-302 

Temporal scale, 36; in GIS, 296-98; in 
sampling, 7-11 

Temporal variability, 663-64; 
breeding seabird habitat 
occupancy, 171-81; yellow-billed 
cuckoo habitat quality, 183-96 
Tennessee, 221, 607-15

Tennessee Valley Authority reservoirs, 
530 

Tennessee warbler (Vermivora 
peregrina), 32.1-32.2 

Terminology, standard, 43-50, 124 

Terrain shape index (TSI), 532 

Terra satellite, 301 

Territory, 52 

Testing data sets, 129, 274, 21.3 

Tetratheca ciliata, 312 

Tetratheca stenocarpa, 310, 311, 312, 
24.1 

Texas, 221, 582 

Texas Parks and Wildlife Department, 
355 

Thamnophis elegans, 650 

Thematic error, 429-30 

Thematic Mapper Simulator (TMS), 
718, cs63.1, cs63.4-cs63.9

Theobald, David M., 665 

Theoretical ecology, 11-12, 77-78, 81 

Theoretical models, 63, 64, 4.1 

Theoretical null pattern, 85-87, 6.3 

Thomomys umbrinus, 650 

Thoreau, Henry David, 739 

Three-toed woodpecker (Picoides 
tridactylus), 579 

Threshold area, 89 

Thresholds, 276-80, 322-24, 21.2, 
21.4, 25.3 

TIGER (U.S. Census Bureau), 160 

Timber harvesting: Baraboo Hills 
ecosystem (Wis.), 403, 34.1, 34.5; 
Calling Lake Study Area (Alberta), 
560, 568; Idaho Southern 
Batholith, 682; Klamath Province, 
243, 249; Pacific Northwest, 701; 
Santa Cruz County (Calif.), 232, 
235 

Time-allocation hypothesis, 736 

Time-based representations, 297 

TMSURVIV, 727-28, 730, 732-33, 
736 

Toiyabe Range, 511, 514-16 

Tools, 266-67, 740-41, 743 

Topographic convergence index (TCI), 
531 

Topographic relative moisture index 
(TRMI), 532 

Toquima Range, 514 

Townsend’s ground squirrel 
(Spermophilus townsendii), 745 


Townsend's solitaire (Myadestes 
townsendi), 580 

Townsend’s warbler (Dendroica 
townsendi), 580, 731, 64.1-64.3 

Training, 274, 347 

Training cases, 273 

Training data sets, 274, 21.2, 21.3; 
Belalp/Spring Mountains alpine 
species, 320; golden eagle nest site 
study, 278; Spring Mountains 
ground cover density, 318 

Training parameters, 348 

Trani, Margaret Katherine (Griep), 
293 

Transect surveys: Great Basin, 511; 
Rocky Mountains, 108, 8.1-8.2 

Transient bird species, identification 
of, 727-36 

Transmutation errors, 293 

Trapping: brown-headed cowbirds, 
548, 550-51, 49.1; small 
mammals, 442, 445, 38.1 

Treefrog, 153 

Trembling aspen (Populus 
tremuloides), 560 

Trifolium alpinum, 25.4

Trifolium alpinum, 324, 25.2-25.3, 
25.2 

Trimble Geoexplorer GPS, 318, 531 

TRMI (topographic relative moisture 
index), 532 

Trout (Salmo trutta), 352

Trowbridge's shrew (Sorex 
trowbridgii), 18.1 

TSI (terrain shape index), 532 

t-tests, 55, 104, 3.1 

Tundra vole (Microtus oeconomus), 
153 

Tweaking, 539 

Tympanuchus pallidicinctus, 650 

Type I and II errors. See Errors of
commission and omission 


Umatilla National Forest, 728 

Umbrella species, 67 

Uncertainty, structural, of models, 
208-10, 472, 632 

Understory cover: Calling Lake bird 
species study, 560; forest songbird 
species studies, 401, 34.3-34.4; 
Swainson's thrush study, 113, 117, 
8.6 


Unimodal diversity-productivity curve, 
16-19 

United States: brown-headed cowbird 
study, 219-20, cs17.1; wood 
thrush bias-control study, 540, 
48.1-48.4. See also individual 
departments and bureaus 

Universal Transverse Mercator system, 
501 

University of Maryland, 549 

Upper Midwest Gap Analysis 
Program, 160 

Urban growth models: Santa Cruz 
County (Calif.), 229-30, 233, 234, 
236, 237-38, 18.1, cs18.3 

Urbanized environments: Eurasian 
badger study, 255-62 

UTM (universal transverse mercator), 
559 


Validation of models, 52, 64-65, 69, 
206-8, 447-48, 464-65; 
Australian rare plant species, 
308-9; butternut trees, 491-97; 
Calling Lake bird species, 569; 
Eurasian badger study, 257-62; 
forest songbird species, 402; Great 
Basin butterflies, 509, 513-14; 
green woodpecker habitat study, 
203; Maine land bird study, 
598-601; northern bobwhite ANN 
model, 348; Southern Appalachian 
bird species, 607-15; Swainson's 
thrush study, 111; wood thrush 
habitat study, 533; yellow-billed 
cuckoo, 185-87 

Valley oak (Quercus lobata), 188 

Van Horne, Beatrice, 26, 27, 28-29, 
329s 1d 

Van Manen, Frank T., 462 

Variable populations, modeling of, 
126-29, 9.1 

Variable selection, 743 

Varied thrush (Ixoreus naevius), 109, 
580 

Vaux’s swift (Chaetura vauxi), 236, 
SOIN 

Veery (Catharus fuscescens), 152, 228, 
11.2; Idaho, 580; Maine, 595, 
53.1-53.3; Southern Appalachian 
Assessment, 608, 611-12, 613, 
54.2-54.3; Wisconsin, 404-5, 
34.1, 34.1-34.2, 34.5-34.7 


Vegetation classification, 134 

Vegetation distribution, predicting: 
Mojave Desert, 133-39, 10.1-10.5 

Vegetation maps: Maine, 594-96, 
53.2-53.5 

Vegetation modeling: Mid-Lachlan 
Valley, 73-79; Mojave Desert, 
133-39 

Vegetation surveys: Greater 
Yellowstone ecosystem, 501-2; 
Great Smoky Mountains National 
Park, 492-94, 43.1; Mid-Lachlan 
Valley, 75-76, 5.1; Mojave Desert, 
133, 10.3; Rocky Mountains, 
108-9, 113, 8.1-8.2, 8.6; South 
Dakota, 441-42, 38.1 

Verification, 52; Bayesian models, 
597-601; spatially explicit 
simulation models, 447-54 

Vermivora celata, 650 

Verner, Jerry, 740 

Vernier, Pierre R., 463 

Vertical densitometer, 187 

Vesper sparrow (Pooecetes 
gramineus), 228, 504 

Viability, 52 

Viability assessment, 67-71 

Victoria, Australia, 303-13 

Violet-green swallow (Tachycineta 
thalassina), 579 

Virginia, 607-15 

Visual encounter surveys, 160 

Vulpes macrotis, 650 


Wading birds, short- and long-legged, 
471, 472 

Wald statistic of predictor variables, 
199 

Wallace, Alfred Russel, 739 

Wallacean islands, 276, 278 

Warbler species, 54 

Warbling vireo (Vireo gilvus), 228, 
576, 580, 51.4-51.5 

Wasatch Range, 510 

Washington Department of Natural 
Resources, 702 

Waterfowl reserves, 588-89 

Waterfowl surveys, 663 

Watersheds, 26.5; in amphibian 
habitat study, 157-69, 12.1, 12.1; 
in salmonid patch-based study, 
330; in stream-fish study, 520-21, 
523-24, 46.1 


Weather data and analyses: 
Everglades-South Florida, 450; 
Monte Carlo simulation 
experiment, 378; northern 
bobwhite ANN model, 347, 350, 
354, 28.3-28.4 

Wenatchee National Forest, 728 

Western banded gecko (Coleonyx 
variegatus), 647, 650, 57.1-57.2 

Western bluebird (Sialia mexicana), 
580 

Western Cascades, cs19.1, 19.2 

Western flycatcher (Empidonax 
occidentalis). See Cordilleran 
flycatcher; Pacific-slope flycatcher 

Western gull (Larus occidentalis), 370 

Western hemlock (Tsuga 
heterophylla), 109, 113 

Western Klamath, cs19.1, 19.2 

Western larch (Larix occidentalis), 
109, 12ta 

Western meadowlark (Sturnella 
neglecta), 576, 580, 33.1, 
51.4-51.5; North Dakota, 391-97 

Western red cedar (Thuja plicata), 
109, 113 

Western roe deer (Capreolus 
capreolus), 654 

Western tanager (Piranga 
ludoviciana), 580 

Western wheatgrass (Agropyron 
smithii), 441 

Western wood-pewee (Contopus 
sordidulus), 579 

West Indian manatee (Trichechus 
manatus), 468 

Wetlands: Eastern Broadleaf Forest 
Eco-Province, 157-69, 12.1; 
Greater Yellowstone ecosystem, 
501, 503; Santa Cruz County 
(Calif.), 232 

White, Gilbert, 739 

White-ankled mouse (Peromyscus 
pectoralis), 648 

White-breasted nuthatch (Sitta 
carolinensis), 580 

White chanterelle (Cantharellus 
subalbidus), 479, 41.1 

White-crowned sparrow (Zonotrichia 
leucophrys), 503, 580, 18.1 

White-eyed vireo (Vireo griseus), 228 

White fir (Abies concolor), 687 
White-footed mouse (Peromyscus 
leucopus), 85 

White-headed woodpeckers (Picoides 
albolarvatus), 680-84, 60.1, 
cs60.2, 60.3-60.4, 60.2 

White Mountains National Forest 
(Me.), 420, 421, 36.1, 36.3, 
36.1-36.2, 36.4 

White Oak Run, 523, 46.1 

White spruce (Picea glauca), 560 

White-tailed deer (Odocoileus 
virginianus): ATLSS models, 471, 
472; spatially explicit simulation 
models, 450-58, 39.1, 39.5, 39.1 

White-tailed eagle (Haliaeetus 
albicilla), 60.1 

White-tailed ptarmigan (Lagopus 
leucurus), 647, 648, 650 

White-throated sparrow (Zonotrichia 
albicollis), 228; Alberta, 560, 565, 
567, 50.1, 50.5-50.6; Maine, 595, 
53.1-53.3 

White-throated swift (Aeronautes 
saxatalis), 579 

White-winged crossbill (Loxia 
leucoptera), 580 

Whooping crane (Grus americana), 
60.1 

WHR (wildlife habitat relationships). 
See Bird-habitat relationship 
studies; Fish-habitat relationship 
studies 

Wiens, John, 30 

Wilcoxon rank-sum test, 55, 3.1 

Wild boar (Sus scrofa), 60.1 

Wildebeest (Connochaetes taurinus), 
60.1 

Wildfires: Idaho, 285-86, 
cs22.1-cs22.2, cs22.4 

Wildlife 2000: Modeling Habitat 
Relationships of Terrestrial 
Vertebrates, 57, 123, 740, 741, 
742, 744, 746 

Wildlife 2000: Modeling Habitat 
Relationships of Terrestrial 
Vertebrates (international 
symposium), 35, 57 

Wildlife habitat assessment: influence 
of spatial scale on, 151-55, 11.2 

Wildlife habitat relationships. See 
Bird-habitat relationship studies; 
Fish-habitat relationship studies 

Wildlife surveys, 414 
Wild turkey (Meleagris gallopavo), 
5979 

Willamette National Forest, 478, 728 

Williamson’s sapsucker (Sphyrapicus 
thyroideus), 579 

Willow flycatcher (Empidonax 
traillii), 228, 579 

Willow (Salix spp.), 184, 187-88, 
195, 560 

Wilson Landing (Calif.), 185 

Wilson's warbler (Wilsonia pusilla), 
580, 736, 64.1-64.3 

Winter chanterelle (Cantharellus 
tubaeformis), 478, 41.1 

Winter distribution, 221 

Winter fat (Krascheninnikovia lanata), 
285 

Winter wren (Troglodytes 
troglodytes): California, 236, 18.1; 
Idaho, 580; Maine, 598, 53.3, 
53.1-53.3; Pacific Northwest, 731, 
64.1-64.3 

Wisconsin, 158, 399-410, 12.1 

Wisconsin Wetland Inventory, 158, 
160 

Wittsteinia vacciniacea, 311, 312, 24.1 

Wolves, 66, 399 


Wood frog (Rana sylvatica), 160-61, 
163, 165, 12.1-12.2, 12.7 

Woodson Bridge State Recreation 
Area (Calif.), 185 

Wood thrush (Hylocichla mustelina), 
228; Great Smoky Mountains 
National Park, 529-35, cs47.2, 
47.1-47.2; Maine, 622, cs55.1; 
Piedmont National Wildlife 
Refuge, 210-11, 16.2; United 
States, 540, 48.2-48.4 

World-views, 85, 87, 93, 6.4 

World Wide Web, 396, 537 

Worm-eating warbler (Helmitheros 
ENVIRONMENTAL MANAGEMENT / WILDLIFE SCIENCE 


Advance praise for Predicting Species Occurrences 


"The need for land managers to predict species occurrence relative to habitat change has mushroomed 
in response to the National Environmental Policy Act, the Endangered Species Act, and U.S. Forest 
Service planning regulations. Predicting Species Occurrences is the most thorough treatment yet of this 
planning and assessment approach, particularly as related to the greatly enhanced approaches to 
accuracy and scale considerations. This book is a dramatic leap forward." 
— Jack Ward Thomas, Boone and Crockett Professor of Wildlife Conservation, University of 
Montana, and chief emeritus, U.S. Forest Service 


"The successor to Wildlife 2000, this excellent book on current theory and methods for predicting the 
distributions of wild species will be a sourcebook for ecologists, wildlife biologists, and DICE Cera 
for many years to come." 
—Frank W. Davis, Donald Bren School of Environmental Science and Management 


"Conservation and resource management depends on knowing where species are and in what 
numbers. For the first time ever, we have a book that synthesizes the scientific foundation for predicting 
species occurrences. All conservation biologists, wildlife ecologists, fisheries biologists, and forest 
biologists need to have this book in their library. It would even be a good idea for theoretical ecologists to 
own this book, because it may inspire them to advance ecological theory in a manner that would help 
science illuminate a sustainable future." 


— Peter Kareiva, lead scientist, The Nature Conservancy 


Predicting Species Occurrences addresses concerns of where different species are, where they are not, 
and how they move across a landscape or respond to human activity. It highlights for managers and 
researchers the strengths and weaknesses of current approaches, as well as the magnitude of the 
research required to improve or test predictions of currently used models. The book offers important 
new information and will be the standard reference on this subject for years to come. Its 
state-of-the-art assessment will play a key role in guiding the continued development and application of tools for 
making accurate predictions. 


J. Michael Scott is a research biologist with the U.S. Geological Survey and professor of wildlife 
biology at the University of Idaho. Patricia J. Heglund is affiliate assistant professor in the Department of 
Biological Sciences at the University of Idaho and regional biologist for National Wildlife Refuges in 
Alaska. Michael L. Morrison is manager of the White Mountain Research Station for the University 
of California, Bishop, and adjunct professor of biology at the University of Arizona and at Sacramento 
State University. 